ABSTRACT

The problem raised by incremental encryption is the overhead due to the larger storage space required 
by the provision of random blocks together with the ciphered versions of a given document. Besides, 
permitting variable-length modifications on the ciphertext leads to privacy preservation issues. In this paper 
we present incremental encryption schemes which are space-efficient, byte-wise incremental and which 
preserve perfect privacy in the sense that they hide the fact that an update operation has been performed 
on a ciphered document. For each scheme, the run time of updates performed turns out to be very efficient 
and we discuss the statistically adjustable trade-off between computational cost and storage space required 
by the produced ciphertexts. 
I. Introduction

Incremental cryptography, introduced by Bellare, Goldreich and Goldwasser in [1], [2], [3], is used to
maintain up-to-date outputs of cryptographic algorithms at low computational cost. Given the encryption
of the current version of a file, it is preferable to avoid recomputing from scratch the encryption
of the entire file whenever a modification, often minor, is performed on that file. Such efficient
update operations find application in various situations, for example when
we want to maintain constantly changing databases or editable documents on remote servers, such as for
managing remote storage in secure clouds or mobile embedded networks. More precisely, when a document
stored on a server is accessed concurrently by several users who modify it, it is
preferable for a given user that the running time of a local modification depends as
little as possible on the modifications performed by other users and on the size of the updated document.
Ideally, the run time of an update should be proportional to the amount of data changed. When a traditional,
non-incremental cryptographic algorithm is used to perform a modification on the file, exclusive (critical)
access to the entire file has to be acquired and held whenever a modification is performed. Note that having a
"byte-wise" or "bit-wise" incremental scheme is of primary importance. Even though the latter convention
is more general, "byte-wise" incremental updates are well suited to changes in a document, which often involve
replacements, insertions or deletions of a byte string. Moreover, another essential benefit of incremental
cryptographic schemes is their inherent parallelism, which allows the use of multiple processors or cores to
improve performance whenever required by the application.

Related works: The standard authenticated encryption mode GCM (Galois/Counter Mode [4]) does
not support secure replace operations (although the standalone authentication scheme GMAC [5] does).
Some other standard encryption or authenticated encryption schemes (such as XEX, XTS [6], OCB [7]
or inc-IAPM [8]) support at best replace operations because they include a form of block indexing. A
few incremental encryption schemes supporting efficient insert operations exist: one
mode defined in [2] and two modes defined in [8]. However, the first one [2] is not oblivious, that is to
say, one can distinguish between a newly ciphered document and an updated one. Computational efficiency
is not really a problem for incremental encryption schemes. Indeed, in the best cases the run time of an
update is proportional to the number of blocks changed (or inserted) in the document and remains constant
when deleting blocks. The real problem of current incremental encryption schemes is the excessive
expansion of a ciphered document due to the provision of random bit strings. For instance, the most secure
mode rECB [8] (for randomized ECB) produces ciphertexts that are twice as large as plaintexts, and the
authenticated encryption mode RPC [8] produces even larger ciphertexts. Besides, none of these modes
allows the insertion of arbitrarily sized bytestrings without re-aligning and re-enciphering all subsequent
data blocks. Some schemes support efficient variable-length data insertions (or deletions) while ensuring
the obliviousness property by using a synchronisation scheme of random walks [9]. Nevertheless, the method
described in [9] does not solve the problem of the size of cryptographic forms.

Contributions: The AsiaCCS paper [9] briefly introduced a generic construction to extend a
block-wise incremental cryptographic scheme into a fully byte-wise one. The present paper focuses on
byte-wise incremental encryption schemes which produce smaller ciphertexts and still ensure
the privacy of modifications:

• The first one is an incremental block-based encryption scheme having the ability to produce a size 
overhead of about (only) n bits for a document of n blocks. Its extension into a byte-wise incremental 
scheme can be done following the method described in [9]. 

• The method of [9] proposed to use a block-based incremental scheme as a black box without worrying 
about the sizes of produced cryptographic forms. We show that the same approach can be used 
to design byte-wise incremental cryptographic schemes that alleviate this problem. Contrary to the 
approach taken in [9], we describe here a scheme which relies on a stateless mode of encryption, 
which is used as a fine-grained primitive. For a particular probability distribution, we give a new
tight upper bound on the number of block-cipher evaluations needed when performing an update and discuss
the trade-off between average ciphertext size and update efficiency.

• A third and most interesting solution, combining the advantages of the previous schemes, is then proposed.
Although lack of space prevents us from describing its algorithms in full, the first constructions
are presented in a logical, incremental manner from which they can easily be deduced.

• We also give proofs of their ind-CPA security. Since this paper focuses on (non-authenticated) encryption
schemes, chosen-plaintext attacks are the strongest attacks we can protect against.

• For these schemes offering the same security levels, we give a brief comparison in terms of space 
and time efficiencies. 

• We discuss their extensions into authenticated encryption schemes. In particular, we notice that when 
composing them with incremental MAC, the resulting incremental authenticated encryption schemes 
are more space-efficient than previous solutions [8]. 

Outline of the paper: The rest of this paper is organised as follows. Required preliminaries,
including precise security definitions, are given in Section 2. Our incremental modes of encryption,
together with their efficiency analysis, are described in Sections 3, 4 and 5, respectively. The proofs of security
are given in Section 6. Finally, Section 8 concludes this paper.

II. Background and definitions 

Encryption schemes take as input a document D which is usually divided into a sequence of fixed-size
blocks $\sigma_1, \ldots, \sigma_n$. Until now, cryptographic schemes have been defined through a so-called mode of operation over
such blocks. Thus, documents were viewed as strings over an alphabet $\Sigma_b = \{0, 1\}^{8N}$, where N is the
given block size in bytes.

Let us denote by $T^*$ a set of probability measures $\phi$ on the set $S = \{1, \ldots, L\}$, where $\phi(i) > 0$ $\forall i \in S$.
For byte-wise incremental schemes, the bytestring D is divided into variable-sized blocks $(B_i)_{i=1 \ldots n}$ whose
corresponding lengths $(u_i)_{i=1 \ldots n}$ follow a strictly positive and discrete probability distribution $\phi$. Documents
are then viewed as strings over an alphabet $\Sigma_B = \bigcup_{i \in S} \{0, 1\}^{8i}$.

A. Updating documents 

The space of modifications $\mathcal{M}_b$ defined so far was a block-wise space of modifications [8]. It
allows operations such as (delete, i), (insert, i, P*) or even (replace, i, P*), corresponding respectively to
the deletion of the i-th block, the insertion of a block P* just after index i, or the replacement of the i-th
block by P*.

For our purpose, we have to define a byte-wise space of modifications $\mathcal{M}_B$ allowing fine-grained operations
$M = (\text{substitute}, i, j, \beta)$. Such an operation substitutes bytes $i + 1$ to $j - 1$ (inclusive) by $\beta$, a bytestring
of any length (possibly empty). We use hereafter the following conventions:

• If $j = i + 1$, this operation corresponds to an insertion just after byte i;

• If $\beta$ is empty, this operation corresponds to a deletion from byte $i + 1$ to byte $j - 1$ (inclusive);

• If $|\beta| = j - i - 1$, this operation corresponds to a replacement from byte $i + 1$ to byte $j - 1$ (inclusive).

Note that these conventions are not abusive: considering an n-byte document $D = b_1 b_2 \cdots b_n$, a modification
$(\text{substitute}, i, j, \beta)$ can be interpreted as taking the string $b_1 b_2 \cdots b_{i-1} b_i b_j b_{j+1} \cdots b_{n-1} b_n$ and inserting
$\beta$ just after byte index i, so that we obtain the new document $D' = b_1 b_2 \cdots b_{i-1} b_i \beta b_j b_{j+1} \cdots b_{n-1} b_n$. The
resulting document after a modification operation M is denoted D(M). The resulting document after the
ordered modification operations $M_1, M_2, \ldots, M_l$ is denoted $D(M_1, M_2, \ldots, M_l)$, where
$D(M_1, M_2, \ldots, M_l) = (((D(M_1))(M_2)) \ldots)(M_l)$.
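
As an illustration, the substitute operation and the composition of ordered modifications can be sketched in Python as follows. This is only a sketch of the conventions stated above: the function names `substitute` and `apply_all` are hypothetical, and the accepted boundary values (i = 0 for the start of the document, j = n + 1 for its end) are an assumption that the text does not spell out.

```python
def substitute(doc: bytes, i: int, j: int, beta: bytes) -> bytes:
    """Apply (substitute, i, j, beta) with 1-indexed byte positions.

    Bytes i+1 .. j-1 (inclusive) are replaced by beta; bytes 1..i and
    j..n are kept unchanged.
    """
    n = len(doc)
    # Assumed boundary convention: i = 0 and j = n + 1 are allowed.
    if not (0 <= i < j <= n + 1):
        raise ValueError("need 0 <= i < j <= n + 1")
    # 1-indexed bytes 1..i are doc[:i]; 1-indexed bytes j..n are doc[j-1:].
    return doc[:i] + beta + doc[j - 1:]


def apply_all(doc: bytes, operations) -> bytes:
    """Compute D(M_1, ..., M_l) by applying the operations in order."""
    for i, j, beta in operations:
        doc = substitute(doc, i, j, beta)
    return doc


if __name__ == "__main__":
    d = b"abcdef"
    print(substitute(d, 2, 3, b"XY"))   # insertion just after byte 2 (j = i + 1)
    print(substitute(d, 1, 5, b""))     # deletion of bytes 2..4 (beta empty)
    print(substitute(d, 1, 4, b"ZZ"))   # replacement of bytes 2..3 (|beta| = j - i - 1)
```

The three calls correspond to the three conventions (insertion, deletion, replacement) in the order they are listed.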



B. Incremental mode of encryption 

Definition 1: An incremental encryption scheme is specified by a 4-tuple of algorithms $\Psi = (\mathcal{G}, \mathcal{E}, \mathcal{I}, \mathcal{D})$
in which:

• $\mathcal{G}$, the key generation algorithm, is a probabilistic polynomial time algorithm which takes as input a
security parameter k and returns a symmetric key K.

• $\mathcal{E}$, the encryption algorithm, is a probabilistic polynomial time algorithm which takes as input K and
a document $D \in \Sigma^+$ and returns the ciphertext $C = \mathcal{E}_K(D)$.

• $\mathcal{I}$, the incremental update algorithm, is a probabilistic polynomial time algorithm which takes as input
a key K, (a document D), a modification operation $M \in \mathcal{M}$, and the encrypted form C (related to
D) and returns the modified ciphertext C'.

• $\mathcal{D}$, the decryption algorithm, is a deterministic polynomial time algorithm which takes as input a key
K and a ciphertext $C = \mathcal{E}_K(D)$ and returns a document D.

The behaviour of an incremental encryption scheme is depicted in the commutative diagram of Figure 1.
Considering a modified document D' = D(M), it is required that $\mathcal{D}_K(\mathcal{I}_K(M, D, \mathcal{E}_K(D))) = D'$. Note that the
input document D is shown in brackets in the update algorithm of the above definition. The reason is that,
depending on whether we have a block-based or a byte-wise incremental encryption scheme, as well
as on the convention used in the implementation, $\mathcal{I}$ may or may not require access to the document D.

For example, let us consider the use of a random permutation $E_K$, a specific instantiation of which would be
a 16-byte block cipher such as AES. We then recall the encryption phase rECB of a block-based incremental
authenticated encryption scheme defined according to the encrypt-then-MAC composition in [8]. This scheme has
the property of being perfectly private. Given a document D parsed as a sequence of 16-byte blocks $D_1, \ldots, D_n$, the
encryption algorithm $\mathcal{E}$ is constructed in the following way:

1) For each $i = 0, \ldots, n$ we pick $r_i$ uniformly at random. The randomized input is $R = r_0 r_1 \ldots r_n$.

2) Let $C_0 = E_K(r_0)$. For each $i = 1, \ldots, n$ let $C_i = (E_K(r_i \oplus r_0), E_K(r_i \oplus D_i))$. The ciphertext is
$C = C_0 C_1 \ldots C_n$.




Fig. 1: Incremental mode of encryption
The encrypted message is simply C. Concerning the incremental algorithm $\mathcal{I}$, an insertion (or replacement)
of one block is simply done by generating a new random value r* (or, respectively, regenerating the existing
one) and performing two block-cipher operations.
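
The rECB encryption, decryption and single-block insertion recalled above can be sketched as follows. This is a minimal illustration under stated assumptions, not the construction of [8] itself: the permutation $E_K$ is replaced by a toy 4-round Feistel network built from HMAC-SHA256 (a stand-in chosen only so that the example runs with the Python standard library; it is not a vetted block cipher), the second component of each ciphertext pair is read as $E_K(r_i \oplus D_i)$, and all function names are hypothetical.

```python
import hashlib
import hmac
import secrets

BLOCK = 16  # block size in bytes, as for AES


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _round(key: bytes, rnd: int, half: bytes) -> bytes:
    # Feistel round function: HMAC-SHA256 truncated to half a block.
    return hmac.new(key, bytes([rnd]) + half, hashlib.sha256).digest()[:BLOCK // 2]


def prp(key: bytes, block: bytes) -> bytes:
    """Toy invertible permutation standing in for E_K (assumption)."""
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for rnd in range(4):
        left, right = right, _xor(left, _round(key, rnd, right))
    return left + right


def prp_inv(key: bytes, block: bytes) -> bytes:
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for rnd in reversed(range(4)):
        left, right = _xor(right, _round(key, rnd, left)), left
    return left + right


def _encrypt_block(key: bytes, r0: bytes, d: bytes):
    ri = secrets.token_bytes(BLOCK)          # fresh randomizer r_i
    return (prp(key, _xor(ri, r0)), prp(key, _xor(ri, d)))


def recb_encrypt(key: bytes, blocks):
    r0 = secrets.token_bytes(BLOCK)
    cipher = [prp(key, r0)]                  # C_0 = E_K(r_0)
    cipher += [_encrypt_block(key, r0, d) for d in blocks]
    return cipher


def recb_decrypt(key: bytes, cipher):
    r0 = prp_inv(key, cipher[0])
    plain = []
    for first, second in cipher[1:]:
        ri = _xor(prp_inv(key, first), r0)
        plain.append(_xor(prp_inv(key, second), ri))
    return plain


def recb_insert(key: bytes, cipher, i: int, new_block: bytes):
    """Insert new_block just after block i (i = 0 inserts at the front).

    Only the new pair is computed: two forward block-cipher calls, plus
    one inverse call to recover r_0 from C_0.
    """
    r0 = prp_inv(key, cipher[0])
    return cipher[:i + 1] + [_encrypt_block(key, r0, new_block)] + cipher[i + 1:]
```

Each plaintext block yields two ciphertext blocks, which makes the factor-two expansion of rECB directly visible in the data structure.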

Table I (see footnote 1) summarises the efficiency of the schemes proposed in [8] that support delete and insert operations. Note
that RPC is an authenticated encryption scheme following the integrated-checksum model. Concerning
rECB-XOR, if we extract the standalone encryption scheme rECB from rECB-XOR, one can notice that
there is no need to encipher the random values: indistinguishability is still ensured without
this encryption overhead. The real problem with incremental cryptography is that we have to generate
and supply a lot of random values to remove dependencies between the ciphered blocks and thus allow
efficient update operations. This can result, as in the case of rECB, in ciphertexts twice as long as plaintexts.



TABLE I: Summary of incremental modes of operation proposed by Buonanno et al., instantiated with a 128-bit block cipher.
The parameters n, l and $\mu$ are expressed in number of blocks. Queries ($q_{inc}$) to the update oracle concern only one block.
Columns: Algorithms; Ciphertext size; Block cipher evals; Indistinguishability.



C. Indistinguishability 

For a traditional encryption scheme, indistinguishability measures the inability of an adversary to
distinguish ciphertexts. In the case of an incremental encryption scheme, we consider that an adversary must
in addition be unable to distinguish modifications performed on the same ciphertext. Besides, to ensure
indistinguishability of modifications, some stringent conditions have to be applied. Indeed, if the incremental
algorithm $\mathcal{I}$ has a good running time complexity (for instance, linear in the amount of changes), some parts
of a modified ciphertext remain unchanged. So, we require these modifications to be of the same type and
the same length, and to be performed at the same location in the plaintext. Otherwise the adversary can
distinguish between them effortlessly by looking at the resulting updated ciphertext.

We define an adversary $\mathcal{A}$ in a find-then-guess game in which the incremental update algorithm is taken
into account. This security model was first described in [8] and defines such an adversary
as a two-phase algorithm:

1) "find" phase: $\mathcal{A}$ makes queries to its encryption oracle $\mathcal{E}_K(\cdot)$ and updating oracle $\mathcal{I}_K(\cdot, \cdot, \cdot)$ and
eventually submits to the challenger either a pair of distinct chosen plaintexts $(D_0, D_1)$ of the same
length l, or a ciphertext C with a pair of modification operations $(M_0, M_1)$ (of the same type and
modifying the same location).

2) "guess" phase: The challenger selects a bit $b \in \{0, 1\}$ uniformly at random, encrypts the message
$D_b$ with $\mathcal{E}_K$ or applies the update $M_b$ on C with $\mathcal{I}_K$, and the result is given to $\mathcal{A}$, which may then
make more oracle queries. Finally $\mathcal{A}$ must output a guess for the value of b.

We are interested in the property of indistinguishability under an adaptive chosen-plaintext attack (CPA).
The adversary wins if it correctly identifies which of the two documents has been ciphered (or which
one of the two modifications has been applied) in the challenge. The encryption scheme is said to be
secure if reasonable adversaries cannot win significantly more than half the time. In the following, we
define precisely two experiments, one for ciphertext indistinguishability and another one for modification
indistinguishability.



1 Since we will refer to these elements of comparison in the rest of the paper, please note that the number n can represent the size of the
document either in number of 16-byte blocks or in number of bytes; the number of cipher evals changes accordingly.
Message secrecy: Let $\Psi$ be an incremental encryption scheme whose security parameter is k. Let
$\mathcal{A} = (\mathcal{A}_1, \mathcal{A}_2)$ be an adversary that has access to the oracles $\mathcal{E}_K(\cdot)$ and $\mathcal{I}_K(\cdot, \cdot, \cdot)$. Now, let us consider the
following random experiment: k being set to a fixed value, we run the algorithm $\mathcal{G}$ to obtain a secret key
K. The algorithm $\mathcal{A}_1$ takes as input k and outputs a triplet $(D_0, D_1, s)$. The components $D_0$ and $D_1$ are two
distinct plaintexts of the same length l; the component s contains information about the system state, known by
$\mathcal{A}_1$, which will be passed to $\mathcal{A}_2$. A bit $b_0$ is chosen uniformly at random in $\{0, 1\}$ and is kept secret from
$\mathcal{A}_2$. The encryption algorithm $\mathcal{E}_K$ is then launched over the plaintext $D_{b_0}$ and returns a ciphertext C which
constitutes the challenge ciphertext proposed to the algorithm $\mathcal{A}_2$. Thus, $\mathcal{A}_2$ has as inputs $(D_0, D_1, s, C)$.
Eventually, the algorithm $\mathcal{A}_2$ outputs a bit b (which the adversary hopes is equal to $b_0$).

Keeping in mind these notations, the random experiment described previously can be detailed in the 
following way: 



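$\mathrm{Expt}^{\text{I-CPA}}_{\Psi}(\mathcal{A}, k)$

    $K \leftarrow \mathcal{G}(k)$
    $(D_0, D_1, s) \leftarrow \mathcal{A}_1(k)$
    ($D_0$ and $D_1$ are distinct and of the same length)
    $b_0 \leftarrow \{0, 1\}$
    $C \leftarrow \mathcal{E}_K(D_{b_0})$
    $b \leftarrow \mathcal{A}_2(D_0, D_1, s, C)$
    if $b = b_0$ then return 1 else return 0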



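Definition 2: Let $\Psi = (\mathcal{G}, \mathcal{E}, \mathcal{I}, \mathcal{D})$ be an incremental encryption scheme over modification space $\mathcal{M}$,
and let $\mathcal{A}$ be an adversary for a CPA attack. We define the adversary advantage $\mathrm{Adv}^{\text{I-CPA}}_{\Psi}$ by:

$\mathrm{Adv}^{\text{I-CPA}}_{\Psi}(\mathcal{A}, k) = \left| \Pr\left(\mathrm{Expt}^{\text{I-CPA}}_{\Psi}(\mathcal{A}, k) = 1\right) - \frac{1}{2} \right|$

We say that $\Psi$ is $(t, q_e, \mu, q_{inc}, \epsilon)$-secure in the sense of I-CPA if, for any adversary $\mathcal{A}$ which runs in time
t, making $q_e$ queries to the $\mathcal{E}_K(\cdot)$ oracle and $q_{inc}$ valid queries to the $\mathcal{I}_K(\cdot, \cdot, \cdot)$ oracle (with the total number
of ciphered data (see footnote 2) in all encryption and incremental update queries equal to $\mu$), $\mathrm{Adv}^{\text{I-CPA}}_{\Psi}(\mathcal{A}, k)$ is less than
$\epsilon$.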
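
To make the message secrecy experiment concrete, the following Python sketch runs the find-then-guess game against a deliberately weak toy scheme. Everything here is an assumption made for illustration: the harness omits the update oracle, the toy scheme (XOR with a key-derived pad, hence deterministic) is not one of the schemes of this paper, and the names `ind_cpa_experiment`, `toy_encrypt`, `find` and `guess` are hypothetical.

```python
import hashlib
import secrets


def ind_cpa_experiment(keygen, encrypt, find, guess) -> int:
    """One run of the message secrecy experiment; returns 1 if b = b_0."""
    key = keygen()
    oracle = lambda doc: encrypt(key, doc)        # encryption oracle E_K(.)
    d0, d1, state = find(oracle)                  # "find" phase (A_1)
    if d0 == d1 or len(d0) != len(d1):
        raise ValueError("D_0 and D_1 must be distinct and of the same length")
    b0 = secrets.randbelow(2)                     # secret challenge bit
    challenge = encrypt(key, d1 if b0 else d0)
    b = guess(oracle, d0, d1, state, challenge)   # "guess" phase (A_2)
    return 1 if b == b0 else 0


# A deterministic toy scheme: XOR with a pad derived from the key only.
def toy_keygen() -> bytes:
    return secrets.token_bytes(16)


def toy_encrypt(key: bytes, doc: bytes) -> bytes:
    pad = hashlib.shake_256(key).digest(len(doc))
    return bytes(x ^ y for x, y in zip(doc, pad))


# Adversary: remember the encryption of D_0 and compare it with the challenge.
def find(oracle):
    d0, d1 = b"\x00" * 16, b"\x01" * 16
    return d0, d1, oracle(d0)


def guess(oracle, d0, d1, state, challenge) -> int:
    return 0 if challenge == state else 1


if __name__ == "__main__":
    wins = sum(ind_cpa_experiment(toy_keygen, toy_encrypt, find, guess)
               for _ in range(100))
    print(wins, "wins out of 100")
```

Because the toy scheme is deterministic, this adversary wins every run, so its advantage reaches the maximum value of 1/2; a randomized scheme is needed for the advantage to be small.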

Update secrecy: Let $\Psi$ be an incremental encryption scheme whose security parameter is k. Let
$\mathcal{A} = (\mathcal{A}_1, \mathcal{A}_2)$ be an adversary that has access to the oracles $\mathcal{E}_K(\cdot)$ and $\mathcal{I}_K(\cdot, \cdot, \cdot)$. Now, let us consider the
following random experiment: k being set to a fixed value, we run the algorithm $\mathcal{G}$ to obtain a secret key K.
The algorithm $\mathcal{A}_1$ takes as input k and returns a 4-tuple $(C, M_0, M_1, s)$ where $M_0$ and $M_1$ are two possible
modifications on D (of the same type and modifying the same location) and where s contains information
about the system state, known by $\mathcal{A}_1$, which will be passed to $\mathcal{A}_2$. A bit $b_0$ is chosen uniformly at random
in $\{0, 1\}$ and is kept secret from $\mathcal{A}_2$. The modification algorithm $\mathcal{I}_K$ is then launched over $M_{b_0}$, D, C
and returns an updated ciphertext $C' = \mathcal{I}_K(M_{b_0}, D, C)$. The ciphertext C' constitutes the challenge update
proposed to the algorithm $\mathcal{A}_2$. Thus, the algorithm $\mathcal{A}_2$ takes as inputs $(M_0, M_1, s, D, C, C')$ and outputs a
bit b (which the adversary hopes is equal to $b_0$).

Keeping in mind these notations, the random experiment described previously can be detailed in the
following way (IM-CPA stands for indistinguishability of modifications under CPA attack):



2 An appropriate unit of measurement is selected, as the case might be. 
$\mathrm{Expt}^{\text{IM-CPA}}_{\Psi}(\mathcal{A}, k)$

    $K \leftarrow \mathcal{G}(k)$
    $(C, M_0, M_1, s) \leftarrow \mathcal{A}_1(k)$
    ($M_0 = (\text{substitute}, i_0, j_0, \beta_0)$ and $M_1 = (\text{substitute}, i_1, j_1, \beta_1)$ are such
    that $i_0 = i_1$, $j_0 = j_1$ and $|\beta_0| = |\beta_1|$)
    $b_0 \leftarrow \{0, 1\}$
    $C' \leftarrow \mathcal{I}_K(M_{b_0}, D, C)$
    $b \leftarrow \mathcal{A}_2(M_0, M_1, s, D, C, C')$
    if $b = b_0$ then return 1 else return 0



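Definition 3: Let $\Psi = (\mathcal{G}, \mathcal{E}, \mathcal{I}, \mathcal{D})$ be an incremental encryption scheme over modification space $\mathcal{M}$,
and let $\mathcal{A}$ be an adversary for a CPA attack. We define the adversary advantage $\mathrm{Adv}^{\text{IM-CPA}}_{\Psi}$ by:

$\mathrm{Adv}^{\text{IM-CPA}}_{\Psi}(\mathcal{A}, k) = \left| \Pr\left(\mathrm{Expt}^{\text{IM-CPA}}_{\Psi}(\mathcal{A}, k) = 1\right) - \frac{1}{2} \right|$

We say that $\Psi$ is $(t, q_e, \mu, q_{inc}, \epsilon)$-secure in the sense of IM-CPA if, for any adversary $\mathcal{A}$ which runs in
time t, making $q_e$ queries to the $\mathcal{E}_K(\cdot)$ oracle and $q_{inc}$ valid queries to the $\mathcal{I}_K(\cdot, \cdot, \cdot)$ oracle (with the total
number of ciphered data in all encryption and incremental update queries equal to $\mu$), $\mathrm{Adv}^{\text{IM-CPA}}_{\Psi}(\mathcal{A}, k)$ is
less than $\epsilon$.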

We provide in Table I the security of the schemes from [8]. As can be observed, the combined
authenticated encryption rECB-XOR and the standalone encryption rECB are secure. However, this is not
the case for RPC which, moreover, suffers from a high expansion of the ciphertext. Although this expansion is
parametrisable, trying to reduce it weakens the security of the scheme.



D. Perfect privacy 

A simple way to design an incremental encryption algorithm is the following. Instead of applying
modifications to the plaintext message and re-enciphering the new one from scratch, take the current version
of the ciphertext and append a ciphered description of the modifications to obtain the new ciphertext. Such a
scheme may not be acceptable since it is not history-free, in the sense that it can reveal to someone who
knows the decryption key all previous versions of the document. Moreover, this method is not efficient and
produces a ciphertext which grows larger with each modification.

Suppose Alice sends a ciphertext to Bob. The latter might be disappointed if he realizes that the ciphered 
document has been obtained by an incremental update. This problem could arise with documents such 
as commitments, contracts whose cryptographic forms must not disclose any information at all about 
modifications, not even the fact that update operations have been performed. 

We will say that an incremental encryption scheme is oblivious (or perfectly private) if the behaviour 
of the composition of application of the encryption algorithm followed by incremental update operations 
is indistinguishable from the behaviour of the application of the encryption algorithm alone. A formal 
definition, taken from [10], [8], is the following: 

Definition 4: Let $\Psi$ be an incremental encryption scheme over modification space $\mathcal{M}$. We say that $\Psi$ is
oblivious (or perfectly private) if, for any two documents $D$, $D_l$, for any sequence of modifications $M_1, \ldots, M_l \in \mathcal{M}$
such that $D_l = D(M_1, \ldots, M_l)$, and for all keys K, we have

$\{\mathcal{E}_K(D_l)\} = \{\mathcal{I}_K(M_l, D_{l-1}, \ldots, \mathcal{I}_K(M_1, D, \mathcal{E}_K(D)) \ldots)\}$.

Note that we could be interested in computational indistinguishability between encrypted documents and
updated ones. Let us consider a privacy property defined by the following simple game: the adversary
$\mathcal{A}$ interacts with its encryption oracle $\mathcal{E}_K(\cdot)$ and updating oracle $\mathcal{I}_K(\cdot, \cdot, \cdot)$ and eventually submits to the
challenger a document D along with a modification $M \in \mathcal{M}$. The challenger selects a bit $b_0 \in \{0, 1\}$
uniformly at random. If $b_0 = 0$ then the challenger returns $\mathcal{E}_K(D(M))$, otherwise it returns $\mathcal{I}_K(M, D, \mathcal{E}_K(D))$.
The result is given to $\mathcal{A}$, which may then make more oracle queries. Finally $\mathcal{A}$ must output a guess b for
the value of $b_0$. We consider that $\mathcal{A}$ wins if $b_0 = b$ and we say that the scheme is private if reasonable
adversaries cannot win significantly more than half the time. We will denote by $\mathrm{Expt}^{\text{Priv-CPA}}_{\Psi}(\mathcal{A}, k)$ the
corresponding experiment, which returns 1 if $b_0 = b$ and 0 otherwise.

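Definition 5: Let $\Psi = (\mathcal{G}, \mathcal{E}, \mathcal{I}, \mathcal{D})$ be an incremental encryption scheme over modification space $\mathcal{M}$,
and let $\mathcal{A}$ be an adversary for a CPA attack. We define the adversary advantage $\mathrm{Adv}^{\text{Priv-CPA}}_{\Psi}$ by:

$\mathrm{Adv}^{\text{Priv-CPA}}_{\Psi}(\mathcal{A}, k) = \left| \Pr\left(\mathrm{Expt}^{\text{Priv-CPA}}_{\Psi}(\mathcal{A}, k) = 1\right) - \frac{1}{2} \right|$

We say that $\Psi$ is $(t, q_e, \mu, q_{inc}, \epsilon)$-secure in the sense of Priv-CPA if, for any adversary $\mathcal{A}$ which runs in
time t, making $q_e$ queries to the $\mathcal{E}_K(\cdot)$ oracle and $q_{inc}$ valid queries to the $\mathcal{I}_K(\cdot, \cdot, \cdot)$ oracle (with the total
number of ciphered data in all encryption and incremental update queries equal to $\mu$), $\mathrm{Adv}^{\text{Priv-CPA}}_{\Psi}(\mathcal{A}, k)$
is less than $\epsilon$. If $\mathrm{Adv}^{\text{Priv-CPA}}_{\Psi}(\mathcal{A}, k) = 0$ we say that $\Psi$ is perfectly private.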

The case of byte-wise incremental schemes: A simple way to construct a byte-wise incremental 
encryption scheme is to use a block-based incremental encryption scheme in which we assign only one 
byte of the document per block, but at the cost of both a large number of block-cipher evals and a very 
large ciphertext. A more interesting solution is to assign a variable number (chosen from a probability 
distribution) of contiguous bytes of the document per block in order to decrease both the computational and 
the size overheads. Concerning the privacy of modifications performed on a document, an adversary should 
not find a scenario of successive modifications which leads to a bias in the distribution of block lengths. 
Thus, an incremental update algorithm has to ensure that statistical tests will not reveal outlying regions in 
the sequence of variable-length parts and therefore the location of insertions. Secondly, concerning the size 
of the document, it has to conserve (on average) the overhead for both the ciphertext size and the number of 
operations to perform in the decryption algorithm. This should be implied by the perfect privacy property. 

Definition 6: An incremental variable-length block-based encryption scheme is perfectly private if and
only if: (i) The distribution of the blocks' lengths does not depend on information about the modifications
performed, that is, update operations are implemented so that this distribution is preserved; (ii) The way
these blocks are operated on during an update is itself perfectly private, that is, after a modification, the set
of relations between these blocks is indistinguishable from the set of relations between the blocks of a
non-updated message.



E. Relation among these notions 

It is not difficult to see that if an incremental encryption scheme $\Psi$ is perfectly private and ensures
update secrecy then this same scheme ensures message secrecy as well. Intuitively, if $\Psi$ is perfectly private,
an updated encryption is indistinguishable from an initial encryption and we could obtain a large
encrypted message by essentially making several updates on an initial small encrypted message (see footnote 3). In such
a situation distinguishing the encryption of two distinct messages (of the same length) is almost the same as
distinguishing two distinct updates (of the same type and modifying the same location in the document).
Note that this property is particularly useful when providing proofs of security and constructing algorithms.
Despite this, the paper stays conservative in the description of our incremental schemes by giving different
algorithms for encryption and update and by providing proofs of security for perfect privacy, update secrecy
and message secrecy, even though the latter appears to be redundant.

Theorem 1: Let $\Psi = (\mathcal{G}, \mathcal{E}, \mathcal{I}, \mathcal{D})$ be an incremental encryption scheme over modification space $\mathcal{M}$. If $\Psi$
is $(t, q_e, \mu, q_{inc}, \epsilon)$-secure in the sense of IM-CPA and $(t', q'_e, \mu', q'_{inc}, \epsilon')$-secure in the sense of Priv-CPA
then $\Psi$ is also $(t + t', q_e + q'_e, \mu + \mu', q_{inc} + q'_{inc}, \epsilon + \epsilon')$-secure in the sense of I-CPA.



3 To be more explicit about the underlying idea, we could imagine an initial empty message.
Proof of Theorem 1: Consider the message secrecy experiment $\mathrm{Expt}^{\text{I-CPA}}_{\Psi}(\mathcal{A}, k)$ and change it slightly
as follows: the algorithm $\mathcal{A}_1$ takes as input k and outputs the 4-tuple $(D, M_0, M_1, s)$ where $M_0$ and $M_1$ are
two modifications (of the same type and modifying the same location) such that $D(M_0)$ and $D(M_1)$ are two
distinct documents. Note that these modifications could change D completely. A bit $b_0$ is chosen uniformly
at random in $\{0, 1\}$ and is kept secret from $\mathcal{A}_2$. A random ciphertext C of $D(M_{b_0})$, which constitutes
the challenge ciphertext, is then proposed to the algorithm $\mathcal{A}_2$. This slightly changed experiment, denoted
$\mathrm{Expt\text{-}b}^{\text{I-CPA}}_{\Psi}(\mathcal{A}, k)$, can be detailed in the following way:



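$\mathrm{Expt\text{-}b}^{\text{I-CPA}}_{\Psi}(\mathcal{A}, k)$

    $K \leftarrow \mathcal{G}(k)$
    $(D, M_0, M_1, s) \leftarrow \mathcal{A}_1(k)$
    ($M_0 = (\text{substitute}, i_0, j_0, \beta_0)$ and $M_1 = (\text{substitute}, i_1, j_1, \beta_1)$ are such
    that $i_0 = i_1$, $j_0 = j_1$ and $|\beta_0| = |\beta_1|$)
    $b_0 \leftarrow \{0, 1\}$
    $C \leftarrow \mathcal{E}_K(D(M_{b_0}))$
    $b \leftarrow \mathcal{A}_2(D, M_0, M_1, s, C)$
    if $b = b_0$ then return 1 else return 0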



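This new experiment being equivalent to the previous one, we therefore have:

$\Pr\left(\mathrm{Expt\text{-}b}^{\text{I-CPA}}_{\Psi}(\mathcal{A}, k) = 1\right) = \Pr\left(\mathrm{Expt}^{\text{I-CPA}}_{\Psi}(\mathcal{A}, k) = 1\right)$.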

On the basis of the last defined experiment, make the following change, which consists in replacing the
challenge encryption $\mathcal{E}_K(D(M_{b_0}))$ by the composition of an encryption followed by an update, $\mathcal{I}_K(M_{b_0}, D, \mathcal{E}_K(D))$.
This new experiment, denoted $\mathrm{Expt\text{-}b}^{\text{IM-CPA}}_{\Psi}(\mathcal{A}, k)$, can be detailed in the following way:



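$\mathrm{Expt\text{-}b}^{\text{IM-CPA}}_{\Psi}(\mathcal{A}, k)$

    $K \leftarrow \mathcal{G}(k)$
    $(D, M_0, M_1, s) \leftarrow \mathcal{A}_1(k)$
    ($M_0 = (\text{substitute}, i_0, j_0, \beta_0)$ and $M_1 = (\text{substitute}, i_1, j_1, \beta_1)$ are such
    that $i_0 = i_1$, $j_0 = j_1$ and $|\beta_0| = |\beta_1|$)
    $b_0 \leftarrow \{0, 1\}$
    $C \leftarrow \mathcal{I}_K(M_{b_0}, D, \mathcal{E}_K(D))$
    $b \leftarrow \mathcal{A}_2(D, M_0, M_1, s, C)$
    if $b = b_0$ then return 1 else return 0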



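Assume that we have a distinguishing algorithm Dist which takes as input a bit a. If a = 0, Dist
runs $\mathrm{Expt\text{-}b}^{\text{I-CPA}}_{\Psi}(\mathcal{A}, k)$, otherwise it runs $\mathrm{Expt\text{-}b}^{\text{IM-CPA}}_{\Psi}(\mathcal{A}, k)$. Since the incremental
encryption scheme is assumed to ensure the privacy of modifications, the distinguishing advantage of Dist is at most
$\mathrm{Adv}^{\text{Priv-CPA}}_{\Psi}(k)$, where $\mathrm{Adv}^{\text{Priv-CPA}}_{\Psi}(k) = \max_{\mathcal{A}} \{\mathrm{Adv}^{\text{Priv-CPA}}_{\Psi}(\mathcal{A}, k)\}$.

We note, moreover, that in the "guess" phase of the experiment $\mathrm{Expt\text{-}b}^{\text{IM-CPA}}_{\Psi}(\mathcal{A}, k)$ the intermediate
ciphertext C (before update) is not known to $\mathcal{A}_2$. Consequently, we have

$\left| \Pr\left(\mathrm{Expt\text{-}b}^{\text{IM-CPA}}_{\Psi}(\mathcal{A}, k) = 1\right) - \frac{1}{2} \right| \leq \left| \Pr\left(\mathrm{Expt}^{\text{IM-CPA}}_{\Psi}(\mathcal{A}, k) = 1\right) - \frac{1}{2} \right|$.

We then conclude from the triangle inequality that:

$\mathrm{Adv}^{\text{I-CPA}}_{\Psi}(\mathcal{A}, k) \leq \mathrm{Adv}^{\text{Priv-CPA}}_{\Psi}(\mathcal{A}, k) + \mathrm{Adv}^{\text{IM-CPA}}_{\Psi}(\mathcal{A}, k) < \epsilon' + \epsilon$. ■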
III. A SPACE-EFFICIENT BLOCK-WISE INCREMENTAL ENCRYPTION SCHEME

In this section we describe our block-wise incremental encryption scheme, which we call swrECB
since the idea is to use a sliding window over the randomizers. Depending on the parametrisation used, this
scheme can be quite space-efficient. Let us consider a family F of pseudorandom functions with input length
and output length both equal to 8N. Let us also consider an instance $F_K$ of F and two fixed integers
$(e, d) \in (\mathbb{N}^*)^2$ such that $8N = ed$ with $d > 2$. We assume the use of a key generation algorithm that takes as
input a security parameter k and returns a symmetric key K. In the following subsections we describe the
three remaining algorithms.

A. Encryption and decryption algorithms 

Algorithm 1 describes the encryption operation. It takes as input a key K and a document of n
blocks of 8N bits each. First of all, it generates uniformly at random $n + d - 1$ blocks of e bits each. Then
it applies $F_K$ to the concatenation of the first d random blocks, and XORs (exclusive-or) the result with
the first plaintext block to obtain the first ciphered block. Then, it repeatedly performs the following steps:
it considers the last $d - 1$ random blocks of the current window and the immediately following random
block. $F_K$ is applied to the concatenation of these d "shifted" random blocks and the result is XORed with
the following plaintext block to obtain the corresponding ciphered block.



Algorithm 1 $\mathcal{E}$

Input: A blockstring $D = \{P_1 P_2 \cdots P_n\}$, a key K
    for $j = 1 \to d - 1$ do
        $r_j \leftarrow \{0, 1\}^e$;
    for $j = 1 \to n$ do
        $r_{d-1+j} \leftarrow \{0, 1\}^e$;
        $C_j \leftarrow F_K(r_j \| r_{j+1} \| \ldots \| r_{j+d-1}) \oplus P_j$;
    return $((r_i)_{i=1 \ldots n+d-1}, (C_i)_{i=1 \ldots n})$;



Algorithm 2 describes the deterministic decryption operation. It takes as input a key K and the ciphered
document, and applies the same operations as the encryption algorithm, with the ciphertext blocks in place
of the plaintext blocks.



Algorithm 2 $\mathcal{D}$

Input: A ciphertext $((r_i)_{i=1 \ldots n+d-1}, (C_i)_{i=1 \ldots n})$, a key K
    for $j = 1 \to n$ do
        $P_j \leftarrow F_K(r_j \| r_{j+1} \| \ldots \| r_{j+d-1}) \oplus C_j$;
    return $(P_i)_{i=1 \ldots n}$;
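
Algorithms 1 and 2 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than a reference implementation: $F_K$ is instantiated with HMAC-SHA256 truncated to N bytes (a stand-in for the pseudorandom function, chosen only because it is available in the standard library), the parameters are fixed to N = 16 bytes, e = 32 bits and d = 4 so that 8N = ed, and the names `prf`, `encrypt` and `decrypt` are hypothetical. Indices are 0-based, so block j uses the window `r[j:j + D_WIN]`.

```python
import hashlib
import hmac
import secrets

N = 16                 # block size in bytes (8N = 128 bits)
E_BYTES = 4            # size e of one random block: 32 bits
D_WIN = N // E_BYTES   # window width d, so that 8N = e * d (here d = 4)


def prf(key: bytes, window: bytes) -> bytes:
    """Stand-in for F_K: HMAC-SHA256 truncated to N bytes (assumption)."""
    return hmac.new(key, window, hashlib.sha256).digest()[:N]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt(key: bytes, blocks):
    """Algorithm 1: returns (randomizers, ciphertext blocks)."""
    n = len(blocks)
    # n + d - 1 random blocks of e bits each.
    r = [secrets.token_bytes(E_BYTES) for _ in range(n + D_WIN - 1)]
    # Block j is masked with F_K applied to the window r_j || ... || r_{j+d-1}.
    c = [xor(prf(key, b"".join(r[j:j + D_WIN])), blocks[j]) for j in range(n)]
    return r, c


def decrypt(key: bytes, r, c):
    """Algorithm 2: same pad computation, applied to the ciphertext blocks."""
    return [xor(prf(key, b"".join(r[j:j + D_WIN])), c[j]) for j in range(len(c))]


if __name__ == "__main__":
    key = secrets.token_bytes(16)
    document = [bytes([i]) * N for i in range(6)]
    r, c = encrypt(key, document)
    assert decrypt(key, r, c) == document
    overhead_bits = 8 * E_BYTES * len(r)
    print("randomizer overhead:", overhead_bits, "bits for", len(document), "blocks")
```

With these parameters the randomizers add e(n + d − 1) bits to the ciphertext, compared with one full extra block per plaintext block for rECB.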



B. Incremental update algorithms 

As seen before, by sliding a window over the sequence $(r_i)_{i=1 \ldots n+d-1}$, we can obtain n blocks of
8N bits each. For convenience, we will refer to a window of index i as the block $r_i \| r_{i+1} \| \ldots \| r_{i+d-1}$. The
update operations (Algorithms 3, 4 and 5) are described on a case-by-case basis because the random blocks
involved call for different (cautious) treatment in each case. As we can see, when a random block is affected,
a certain number of upstream and downstream windows need to be re-evaluated and the corresponding
plaintext blocks re-enciphered. Besides, for security purposes, in Algorithms 4 and 5 we mandate the
redrawing of all the random blocks in the window which serves to cipher the replaced (or inserted) block.

C. Efficiency analysis

Certain assumptions, such as Algorithms 3, 4 and 5 modifying only one block, were made for simplicity.
In the following, we discuss the efficiency of more general algorithms:

• Algorithm 3 assumes the presence in the document of $d - 1$ blocks before the deleted one. If
there are only $k$ blocks before block $m$ with $k < d - 1$, only $k$ evaluations of $F_K$ are in fact required.
Besides, none of the random blocks needs to be regenerated. The extension of this algorithm to the
deletion of any number $q$ of contiguous blocks is clear. Whatever this number is, the number of
windows affected is at most $d - 1$. Consequently, the number of evaluations of $F_K$ remains at most
$d - 1$.



Algorithm 3 Deletion of a block $P_m$

Input: the ciphertext $((r_i)_{i=1 \ldots n+d-1}, (C_i)_{i=1 \ldots n})$, the document $D$, a key $K$
1: for $i = m - (d - 1) \to m - 1$ do
2:     $C'_i \leftarrow F_K(r_i \| r_{i+1} \| \cdots \| r_{m-1} \| r_{m+1} \| \cdots \| r_{i+d}) \oplus P_i$;
3: return $((r_i)_{i=1 \ldots m-1}, (r_i)_{i=m+1 \ldots n+d-1}, (C_i)_{i=1 \ldots m-d},
           (C'_i)_{i=m-(d-1) \ldots m-1}, (C_i)_{i=m+1 \ldots n})$;



• In Algorithm 4, the $d$ random blocks contained in the window of index $m$ need to be regenerated.
The algorithm as described assumes the presence in the document of $d - 1$ blocks before and after
block $m$. If there are only $k$ blocks before block $m$ and only $l$ blocks after it, with $k < d - 1$ and
$l < d - 1$, only $k + l + 1$ evaluations of $F_K$ are in fact required. The extension of this algorithm to
the replacement of any number $q$ of contiguous blocks implies the regeneration of at most $q + d - 1$
random blocks, affecting at most $q + 2(d - 1)$ windows. We deduce a number of evaluations of $F_K$
of at most $q + 2(d - 1)$.



Algorithm 4 Replacement of a block $P_m$ by $P'_m$

Input: the ciphertext $((r_i)_{i=1 \ldots n+d-1}, (C_i)_{i=1 \ldots n})$, the document $D$, a key $K$
1: for $j = m \to m + d - 1$ do
2:     $r'_j \leftarrow \{0,1\}^e$;
3: for $j = m - (d - 1) \to m - 1$ do
4:     $C'_j \leftarrow F_K(r_j \| \cdots \| r_{m-1} \| r'_m \| \cdots \| r'_{j+d-1}) \oplus P_j$;
5: $C'_m \leftarrow F_K(r'_m \| r'_{m+1} \| \cdots \| r'_{m+d-1}) \oplus P'_m$;
6: for $j = m + 1 \to m + d - 1$ do
7:     $C'_j \leftarrow F_K(r'_j \| \cdots \| r'_{m+d-1} \| r_{m+d} \| \cdots \| r_{j+d-1}) \oplus P_j$;
8: return $((r_i)_{i=1 \ldots m-1}, (r'_i)_{i=m \ldots m+d-1}, (r_i)_{i=m+d \ldots n+d-1},
           (C_i)_{i=1 \ldots m-d}, (C'_i)_{i=m-(d-1) \ldots m+d-1}, (C_i)_{i=m+d \ldots n})$;

• In Algorithm 5, the $d - 1$ random blocks that follow $r_m$ are regenerated and a new random block
of $e$ bits is inserted between $r_m$ and $r_{m+1}$. Again for the sake of simplicity, the algorithm
assumes the presence in the document of $d - 2$ blocks before the block $P_m$ and $d - 1$ blocks after it.
Taking this insertion into account, a new window appears and $2d - 1$ windows are affected, so
that at most $2d$ evaluations of $F_K$ are required. The extension of this algorithm to the insertion of
any number $q$ of contiguous blocks implies the generation of $q$ random blocks, bringing the number
of evaluations of $F_K$ to at most $q + 2(d - 1)$.

Algorithm 5 Insertion of a block $P'$ right after $P_m$

Input: the ciphertext $((r_i)_{i=1 \ldots n+d-1}, (C_i)_{i=1 \ldots n})$, the document $D$, a key $K$
1: $r' \leftarrow \{0,1\}^e$;
2: for $j = m + 1 \to m + d - 1$ do
3:     $r'_j \leftarrow \{0,1\}^e$;
4: $C'_{m-(d-2)} \leftarrow F_K(r_{m-(d-2)} \| \cdots \| r_m \| r') \oplus P_{m-(d-2)}$;
5: for $j = m - (d - 3) \to m$ do
6:     $C'_j \leftarrow F_K(r_j \| \cdots \| r_m \| r' \| r'_{m+1} \| \cdots \| r'_{j+d-2}) \oplus P_j$;
7: $C' \leftarrow F_K(r' \| r'_{m+1} \| \cdots \| r'_{m+d-1}) \oplus P'$;
8: for $j = m + 1 \to m + d - 1$ do
9:     $C'_j \leftarrow F_K(r'_j \| \cdots \| r'_{m+d-1} \| r_{m+d} \| \cdots \| r_{j+d-1}) \oplus P_j$;
10: return $((r_i)_{i=1 \ldots m}, r', (r'_i)_{i=m+1 \ldots m+d-1}, (r_i)_{i=m+d \ldots n+d-1},
            (C_i)_{i=1 \ldots m-(d-1)}, (C'_i)_{i=m-(d-2) \ldots m}, C', (C'_i)_{i=m+1 \ldots m+d-1},
            (C_i)_{i=m+d \ldots n})$;



The space occupied by the random blocks $(r_i)_{i=1 \ldots n+d-1}$ is exactly $(ne + (d - 1)e)$ bits. This is
to be compared with the overhead of $8nN$ bits consumed by rECB. In other words, the expansion
of a ciphertext produced by swrECB is about $1 + \frac{e}{8N}$, whereas it is exactly a factor 2 when using
rECB. The ciphering of an $n$-block document requires $n$ evaluations of $F_K$. Finally, the number of
evaluations of $F_K$ for update operations is summarised in the following table:



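| Deletion of $q$ blocks | Replacement of $q$ blocks | Insertion of $q$ blocks |
|------------------------|---------------------------|-------------------------|
| $d - 1$                | $q + 2(d - 1)$            | $q + 2(d - 1)$          |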
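
As a worked instance of these counts, assume (purely for illustration; these parameter values are not prescribed by the scheme) a block size of $N = 16$ bytes and randomizers of $e = 8$ bits, so that a window of $8N$ bits holds $d = 8N/e$ randomizers:

```latex
\begin{align*}
d &= \frac{8N}{e} = \frac{128}{8} = 16, \\
\text{expansion} &= \frac{8nN + (n + d - 1)e}{8nN} \approx 1 + \frac{e}{8N} = 1 + \frac{8}{128} = 1.0625, \\
\text{deletion of } q \text{ blocks} &: \ \text{at most } d - 1 = 15 \text{ evaluations of } F_K, \\
\text{replacement or insertion of } q = 1 \text{ block} &: \ \text{at most } q + 2(d - 1) = 31 \text{ evaluations of } F_K .
\end{align*}
```

Under these assumed values the ciphertext is about 6% larger than the plaintext, against 100% for rECB, at the price of up to 31 evaluations of $F_K$ per single-block update.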



IV. A SPACE-EFFICIENT BYTE-WISE INCREMENTAL ENCRYPTION SCHEME 

In this section, we describe our space-efficient, byte-wise incremental and perfectly private incremental
encryption scheme. Unlike the previous scheme, this one relies on a stateless mode of encryption, that is to
say, a mode in which no state is maintained when ciphering successive messages (the initialization
vector is chosen at random each time). As previously, we assume the use of a key generation algorithm
that takes as input a security parameter $k$ and returns a symmetric key $K$. In the following subsections we
describe the three remaining algorithms, namely the encryption, decryption and update algorithms.

| Notation | Meaning |
|----------|---------|
| $D$ | the document is now seen as a bytestring |
| $\beta$ | bytestring to insert |
| $D[a, b]$ | substring of $D$ from the byte $a$ up to the byte $b$ (included) |
| $u_i \xleftarrow{\phi} \{1, \ldots, L\}$ | $u_i$ is drawn from the set $\{1, \ldots, L\}$ according to $\phi$ |
| $B_i$ | $i$-th variable-length block, of size $u_i$ |
| $u = (u_i)_{i=1..n}$ | list/sequence of variable lengths $u_i$ |
| $D = (B_i)_{i=1..n}$ | partitioned form of the bytestring $D$ (sequence of parts $B_i$) |
| $\lvert \cdot \rvert$ | size of a data in bytes or number of elements in a sequence |
| $\lvert u \rvert$ | number of parts in $D$ |
| $\lvert \beta \rvert$ | size of the bytestring $\beta$, in number of bytes |
| $k_0$ | index of a part after which the repartition starts |
| $k_1$ | index of a part before which the repartition terminates |
| $u_0, u_{\lvert u \rvert + 1}$ | sentinel lengths used by convention, and valued at 0 |
| $C_0, C_{\lvert u \rvert + 1}$ | sentinel ciphered parts used by convention, whose contents do not matter |
| $r_0, r_{\lvert u \rvert + 1}$ | sentinel random vectors used by convention, whose values do not matter |

TABLE II: Notations and conventions for mcXOR

A. Notations 

The main notations used are described in Table II. The use of sentinel indices is necessary in the case
of a modification at the very beginning or the very end of the document. For example, if we consider an
$n$-byte document, they allow us to consider modifications of type (substitute, $i$, $j$, $\beta$) with $i = 0$
or $j = n + 1$.

B. Encryption and decryption algorithms 

The bytestring $D$ is divided into variable-sized blocks $(B_i)_{i=1..n}$ whose lengths $(u_i)_{i=1..n-1}$
(all but the last) follow a discrete probability distribution $\phi$ on a set $\{1, \ldots, L\}$. Since the size $S$
of $D$ is fixed, an initially empty partition $B$ is in practice built by repeating the following operations:
(i) draw a number $a$ from $\phi([1, L])$ and set $S = S - a$; (ii) if $S > 0$, insert into $B$ the part containing
the next $a$ bytes; otherwise, terminate by inserting the part made of the remaining $S + a$ bytes.
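
The partitioning loop (i)-(ii) can be sketched as follows. This is a minimal illustration assuming a uniform $\phi$ on $\{1, \ldots, L\}$ and a document of at least one byte; `draw_partition` is a hypothetical name, and `random.Random` is used only to make the example reproducible (an implementation would draw from a cryptographically secure source).

```python
import random


def draw_partition(size: int, L: int, rng: random.Random) -> list[int]:
    """Return part lengths summing to `size` (size >= 1); all but the last are drawn from U([1, L])."""
    lengths = []
    remaining = size                        # S in the text
    while True:
        a = rng.randint(1, L)               # (i) draw a and set S = S - a
        remaining -= a
        if remaining > 0:
            lengths.append(a)               # (ii) the next part holds the next a bytes
        else:
            lengths.append(remaining + a)   # last part: the remaining S + a bytes
            return lengths
```

Only the last length is truncated, which is why the text excludes it from the sequence that follows $\phi$.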

Stateless modes of encryption described in [11], such as OFB, CBC or XOR, can be applied on variable
numbers of contiguous bytes of $D$ so that a randomizer, used as an initial vector for the considered mode,
serves not for one block but for several. The process is as follows. We partition the document into
several groups of a variable number of contiguous bytes, the number being chosen according to a discrete
probability distribution (parameterized by a multiple $lN$ of the block size, for instance $\mathcal{U}([1, lN])$).
Then we cipher each group independently with a stateless mode of operation from [11], [12], as if the groups
were different messages. Modes of operation that behave like a synchronous stream cipher (see stream cipher
modes such as OFB or XOR [11]), for which the plaintext is masked with a generated keystream, are preferable
because they reduce the ciphertext expansion. We give an example of an algorithm using the "stream cipher
like" mode XOR (a stateless version of CTR, sometimes called randomized CTR) instantiated with a block
cipher, a good choice to keep a high degree of parallelism.

Let us suppose a function XOR.$\mathcal{E}_K$ which implements the encryption operation of XOR: it takes as
input a symmetric key $K$ and a message $M$, and returns the ciphered message $C$ of the same size together
with the associated random initialization vector IV. We do not recall the description of this well-known
algorithm [12] and we assume that the underlying family of pseudorandom functions is $F$. The encryption
operation, denoted mcXOR (multiple calls to the XOR mode), is described in Algorithm 6.



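Algorithm 6 $\mathcal{E}$

Input: A bytestring $D = \{b_1 b_2 \ldots b_n\}$, a key $K$
1: $n \leftarrow |D|$; $e \leftarrow 1$; $k \leftarrow 1$; $j \leftarrow 1$;
2: while $e \neq 0$ do
3:     $u_j \xleftarrow{\phi} \{1, \ldots, lN\}$;
4:     if $n - (k + u_j) > 0$ then
5:         $(r_j, C_j) \leftarrow$ XOR.$\mathcal{E}_K(D[k, k + u_j - 1])$;
6:         $k \leftarrow k + u_j$; $j \leftarrow j + 1$;
7:     else
8:         $(r_j, C_j) \leftarrow$ XOR.$\mathcal{E}_K(D[k, n])$;
9:         $u_j \leftarrow n - k + 1$; $e \leftarrow 0$;
10: $C \leftarrow C_1 \| C_2 \| \cdots \| C_j$;
11: return $((r_i)_{i=1 \ldots j}, (u_i)_{i=1 \ldots j}, C)$;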



If we denote by $j_0$ the value of $j$ at the termination of the algorithm, the resulting ciphertext is
composed of the sequence of random vectors $(r_i)_{i=1 \ldots j_0}$, the sequence of block lengths
$(u_i)_{i=1 \ldots j_0}$ and the ciphered bytes $C$. Let us suppose
a function XOR.$\mathcal{D}_{K,IV}$ which implements the decryption operation of XOR: it takes as input a
symmetric key $K$ and a ciphertext $C$ together with the associated random IV, and returns the corresponding
plaintext $D$ of the same size. The deterministic decryption is described in Algorithm 7.



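Algorithm 7 $\mathcal{D}$

Input: A ciphertext $((r_i)_{i=1 \ldots j_0}, (u_i)_{i=1 \ldots j_0}, C)$, a key $K$
1: $k \leftarrow 1$;
2: for $i = 1 \to j_0$ do
3:     $B_i \leftarrow$ XOR.$\mathcal{D}_{K,r_i}(C[k, k + u_i - 1])$; $k \leftarrow k + u_i$;
4: $D \leftarrow B_1 \| B_2 \| \cdots \| B_{j_0}$;
5: return $D$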
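
A compact Python sketch of Algorithms 6 and 7 is given below, under explicit assumptions: HMAC-SHA256 truncated to $N$ bytes stands in for the pseudorandom function family $F$, the counter is concatenated to the IV instead of being added to it, part lengths are uniform on $\{1, \ldots, L\}$ and drawn from `random.Random` for reproducibility, and all function names are hypothetical.

```python
import hashlib
import hmac
import random
import secrets

N = 16  # block size in bytes


def keystream(key: bytes, iv: bytes, length: int) -> bytes:
    """Randomized-CTR style keystream; HMAC-SHA256(iv || counter) stands in for F_K."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hmac.new(key, iv + counter.to_bytes(8, "big"), hashlib.sha256).digest()[:N]
        counter += 1
    return bytes(out[:length])


def xor_mode_encrypt(key: bytes, message: bytes):
    """XOR mode: fresh random IV, ciphertext of the same size as the message."""
    iv = secrets.token_bytes(N)
    ks = keystream(key, iv, len(message))
    return iv, bytes(m ^ k for m, k in zip(message, ks))


def xor_mode_decrypt(key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    ks = keystream(key, iv, len(ciphertext))
    return bytes(c ^ k for c, k in zip(ciphertext, ks))


def mcxor_encrypt(key: bytes, data: bytes, L: int, rng: random.Random):
    """Cut the document into parts of random length and encrypt each part separately."""
    ivs, lengths, parts = [], [], []
    k = 0  # 0-based offset of the next byte to encrypt
    while k < len(data):
        u = min(rng.randint(1, L), len(data) - k)  # the last part takes what is left
        iv, c = xor_mode_encrypt(key, data[k:k + u])
        ivs.append(iv)
        lengths.append(u)
        parts.append(c)
        k += u
    return ivs, lengths, b"".join(parts)


def mcxor_decrypt(key: bytes, ivs, lengths, ciphertext: bytes) -> bytes:
    """Walk the length sequence and decrypt each part with its own IV."""
    out, k = [], 0
    for iv, u in zip(ivs, lengths):
        out.append(xor_mode_decrypt(key, iv, ciphertext[k:k + u]))
        k += u
    return b"".join(out)
```

The ciphered bytes have exactly the size of the document; the overhead consists of one IV and one length per part.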



C. Incremental update algorithm 

We recall that a random step in a walk corresponds to a random draw $u_i \sim \phi$. Let us assume that we
have to insert a bytestring $\beta$ somewhere between the first byte of $B_{k-1}$ and the first byte of $B_k$ in
the sequence of variable-length blocks $B_1, \ldots, B_n$. A first approach, which simply consists in
partitioning $\beta$ in the same manner as the document and inserting it at the right location in $D$ (possibly
splitting $B_{k-1}$ in two parts), is not sufficient. Indeed, this method biases the resulting distribution of
block lengths after multiple insertions. For the same reason, any other method which focuses strictly on the
partitioning of $\beta$ is bound to fail. The approach defined in [9] proposes a synchronization of random
walks so that the probability distribution of the block lengths $(u_i)$ is not disturbed, which ensures the
perfect privacy property and preserves the average space and time overheads. Applying this method allows
a modification to be performed while respecting the length distribution, but leads almost systematically to
the repartitioning of an untouched, though quite limited, subpart of the document. Let the corresponding
sequence of lengths $(u_j)_{j=1..n}$ be drawn i.i.d. (independently and identically distributed) from a discrete
distribution. The problem is to generate a sequence of i.i.d. draws $(u'_j)_{j=k..l}$ from this distribution until
we find a couple of indices $(l, m)$ satisfying the following equality

$$\sum_{j=k}^{l} u'_j = \Delta + \sum_{j=k}^{m} u_j$$

where $\Delta = |\beta| + |B_{k-1}|$ if the insertion is performed between two bytes of $B_{k-1}$ and
$\Delta = |\beta|$ otherwise. For instance, in this second case, the resulting sequence of variable-sized data
blocks is then $B_1, \ldots, B_{k-1}, B'_k, \ldots, B'_l, B_{m+1}, \ldots, B_n$ where the subsequence of blocks
$(B'_i)_{i=k..l}$ (of respective lengths $(u'_i)_{i=k..l}$), which contains the inserted data, replaces the
subsequence $(B_i)_{i=k..m}$.
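
To make the synchronization condition concrete, here is a small worked instance with made-up numbers (they are illustrative only). Suppose $\beta$ of 6 bytes is inserted exactly at the boundary before $B_k$, the existing lengths are $u_k = 5$, $u_{k+1} = 3$, $u_{k+2} = 4$, and the fresh draws are $u'_k = 4$, $u'_{k+1} = 9$, $u'_{k+2} = 5$:

```latex
\begin{align*}
\Delta &= |\beta| = 6, \\
\text{partial sums of the new draws} &: \ 4,\ 13,\ 18, \\
\Delta + \text{partial sums of the old lengths} &: \ 6,\ 11,\ 14,\ 18, \\
\sum_{j=k}^{k+2} u'_j = 4 + 9 + 5 &= 18 = 6 + 5 + 3 + 4 = \Delta + \sum_{j=k}^{k+2} u_j .
\end{align*}
```

The two walks first meet at 18, so $(l, m) = (k+2, k+2)$: the three new parts $B'_k, B'_{k+1}, B'_{k+2}$ hold the 6 inserted bytes followed by the 12 untouched bytes of $B_k, B_{k+1}, B_{k+2}$, which have to be re-ciphered even though their content did not change.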

All the operations described (insertion, deletion and replacement) can be performed through a single
one, a substitute function. The resulting scheme is secure provided that update operations pay particular
attention to the variable-length blocks that are changed (in content or length): the associated random
values must be regenerated and these blocks re-ciphered. This function, described in Algorithm 8, takes as
input an encrypted form $((r_i)_{i=1 \ldots j_0}, (u_i)_{i=1 \ldots j_0}, (C_i)_{i=1 \ldots j_0})$ augmented by
sentinel values for the sake of algorithmic simplification, as they allow modifications at the very beginning
or the very end of the document.
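
Algorithm 8 $\mathcal{I}$

Input: A ciphertext $((r_i)_{i=0 \ldots j_0+1}, (u_i)_{i=0 \ldots j_0+1}, (C_i)_{i=0 \ldots j_0+1})$,
       the document $D$, an operation $M = (\text{substitute}, i, j, \beta)$, a key $K$

1: $v \leftarrow 1$; $c \leftarrow |\beta|$;
2: $k_0 \leftarrow \operatorname{argmin}_a \left( f(a) = \sum_{m=0}^{a} u_m \mid f(a) \ge i \right)$;
3: if $f(k_0) > i$ then
4:     $\beta \leftarrow D[f(k_0 - 1) + 1, i] \| \beta$;
5:     $c \leftarrow c + |D[f(k_0 - 1) + 1, i]|$;
6:     $k_0 \leftarrow k_0 - 1$;
7: $k_1 \leftarrow \operatorname{argmin}_a \left( f(a) = \sum_{m=0}^{a} u_m \mid f(a) \ge j - 1 \right)$;
8: if $f(k_1) > j - 1$ then
9:     $\beta \leftarrow \beta \| D[j, f(k_1)]$;
10:    $c \leftarrow c + |D[j, f(k_1)]|$;
11: $k_1 \leftarrow k_1 + 1$;
12: $x \xleftarrow{\phi} \{1, \ldots, lN\}$;
13: if $x = c$ then
14:     $u'_v \leftarrow x$;
15:     $(r'_v, C'_v) \leftarrow$ XOR.$\mathcal{E}_K(\beta)$;
16:     $v \leftarrow v + 1$;
17:     go to step 34;
18: if $x < c$ then
19:     $c \leftarrow c - x$; $u'_v \leftarrow x$;
20:     $(r'_v, C'_v) \leftarrow$ XOR.$\mathcal{E}_K(\beta[1, x])$;
21:     $\beta \leftarrow \beta[x + 1, |\beta|]$;
22:     $v \leftarrow v + 1$;
23:     go to step 12;
24: else [$x > c$]
25:     if $k_1 = j_0 + 1$ then
26:         $u'_v \leftarrow c$;
27:         $(r'_v, C'_v) \leftarrow$ XOR.$\mathcal{E}_K(\beta)$;
28:         $v \leftarrow v + 1$;
29:         go to step 34;
30:     $c \leftarrow c + u_{k_1}$;
31:     $\beta \leftarrow \beta \| D[f(k_1 - 1) + 1, f(k_1)]$;
32:     $k_1 \leftarrow k_1 + 1$;
33:     go to step 13;
34: $r \leftarrow ((r_i)_{i=1 \ldots k_0}, (r'_i)_{i=1 \ldots v-1}, (r_i)_{i=k_1 \ldots j_0})$;
35: $u \leftarrow ((u_i)_{i=1 \ldots k_0}, (u'_i)_{i=1 \ldots v-1}, (u_i)_{i=k_1 \ldots j_0})$;
36: $C \leftarrow ((C_i)_{i=1 \ldots k_0}, (C'_i)_{i=1 \ldots v-1}, (C_i)_{i=k_1 \ldots j_0})$;
37: return $(r, u, C)$;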



D. Efficiency analysis 

Encryption efficiency: From now on we consider a 16-byte block cipher. We estimate the average
number of block-cipher evaluations required to encrypt a document as follows. Given that a part size $X$
follows a uniform distribution $\mathcal{U}([1, L])$ where $L$ is a multiple of the block size, Wald's equation
[13] tightly upper bounds the average number of parts in a ciphertext by $\frac{2(|D| + L)}{L + 1}$. The
number $n_x$ of blocks in a part follows a distribution $\mathcal{U}([1, L/16])$. Wald's equation then allows us
to upper bound the average number of block-cipher evaluations by the product
$\frac{2(|D| + L)}{L + 1} \cdot \frac{16 + L}{32}$.

Update efficiency: More precisely, let $(X_i)_{i \ge 1}$ and $(Y_i)_{i \ge 1}$ be sequences of independent,
identically distributed and strictly positive random variables with common distribution $\phi$. We define the
random walks $S_n$, $T_m$ by $S_n = \sum_{i=1}^{n} X_i$ and $T_m = \sum_{i=1}^{m} Y_i$, and the random subset
$Z$ of $\mathbb{N}^2$

$$Z = \{(n, m) \in \mathbb{N}^2 ;\ S_n - T_m = C ;\ C \in \mathbb{N}\}.$$

We denote by $C$ the number of contiguous bytes to insert. If $(n, m)$ and $(n', m')$ are distinct in $Z$
then either $n < n'$ and $m < m'$, or $n' < n$ and $m' < m$. In other words, $n \le n'$ and $m' \le m$ leads to
$(n, m) = (n', m')$. Indeed, if $n \le n'$ and $m' \le m$ then $\sum_{i=1}^{n} X_i = \sum_{i=1}^{m} Y_i + C$ and
$\sum_{i=1}^{n'} X_i = \sum_{i=1}^{m'} Y_i + C$ imply $\sum_{i=n+1}^{n'} X_i = -\sum_{i=m'+1}^{m} Y_i$; since
all terms are strictly positive, both sums must be empty. Therefore $Z$ has the form
$\{(n_k, m_k);\ k \in \mathbb{N}^*\}$ and the sequences $(n_k)$ and $(m_k)$ are strictly increasing. Consider the
algorithm described below, which takes as input an initial value $C_0$ for $C$. First, we notice that there are
two ways to terminate the algorithm, namely the execution traces (5,1,2) or (9,2). We can reason about both
the number $n_1$ of draws $X$ and the number $m_1$ of draws $Y$.



1: $X \xleftarrow{\phi} \{1, \ldots, L\}$;
2: if $X = C$ stop;
3: if $X < C$ then
4:     $C \leftarrow C - X$;
5:     go to step 1;
6: if $X > C$ then
7:     $Y \xleftarrow{\phi} \{1, \ldots, L\}$;
8:     $C \leftarrow C + Y$;
9:     go to step 2;
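
The nine-step loop is easy to simulate. The sketch below assumes a uniform $\phi$ on $\{1, \ldots, L\}$ and $C_0 \ge 1$; `synchronize` is a hypothetical name and `random.Random` is used only so that runs are reproducible.

```python
import random


def synchronize(c0: int, L: int, rng: random.Random):
    """Run the loop from C = c0; return the draws X (new lengths) and Y (absorbed lengths)."""
    c = c0
    xs, ys = [], []
    x = rng.randint(1, L)          # step 1
    xs.append(x)
    while x != c:                  # step 2: stop as soon as X = C
        if x < c:                  # steps 3-5: consume X, then draw a new X
            c -= x
            x = rng.randint(1, L)
            xs.append(x)
        else:                      # steps 6-9: X > C, extend C by a draw Y
            y = rng.randint(1, L)
            ys.append(y)
            c += y
    return xs, ys
```

On termination the two walks have met, so `sum(xs) == c0 + sum(ys)` always holds; `len(xs)` and `len(ys)` play the roles of $n_1$ and $m_1$, and averaging `len(xs)` over many runs gives an empirical estimate to compare with the bounds derived in this subsection.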



We also notice the following:

• If $C_0 > L$, the average number of consecutive executions of step 3 is upper bounded by $C_0 / E(X)$,
after which we have $C \le L$.

• If $C_0 \le L$, whenever step 1 (or 7) is executed we have $C \le L$ and then a non-zero probability
$P(X = C)$ (or $P(Y = X - C)$ respectively) to terminate. Let us assume that $\phi$ is a uniform law and
let us denote by $d_1$ the total number of random draws ($d_1 = n_1 + m_1$). It turns out that $d_1$ follows a
geometric law of parameter $1/L$, therefore we have the system of equations:

$$E(n_1) + E(m_1) = L$$

$$E(n_1) - E(m_1) \le \frac{2L}{L + 1}$$

so that $E(n_1) \le 1 + L/2$. Similar but pessimistic upper bounds are possible for a binomial distribution
$\mathcal{B}(L - 1, p) + 1$ or a geometric distribution $\mathcal{G}(p)$ with $p$ a parameter to be set.
In more general terms, assuming a uniform distribution and $C_0 > L$, let $d_0$ denote the number of random
draws needed to satisfy the predicate $C \le L$ for the first time and $d_1$ the number of random draws needed
to terminate the algorithm once this predicate has first been satisfied. So $d = d_0 + d_1$ and we have the
upper bound

$$E(n_1) \le \frac{2 C_0}{1 + L} + \frac{L}{2} + 1.$$

This upper bound corresponds to the average number of parts to (re-)cipher
when the insertion is performed between two parts. When the insertion is performed inside a part, this part
has to be included in the partial repartition. Given that a part size is at most $L$, the upper bound becomes:

$$E(n_1) \le \frac{2 C_0}{1 + L} + \frac{L}{2} + 3.$$

The latter constitutes a tight worst-case upper bound for the average number of changes in the partition.
Continuing, if we denote by $n_U$ the number of blocks to cipher during an update, we can give the following
upper bound:

$$E(n_U) = E(n_1) E(n_x) \le \left( \frac{2 C_0}{1 + L} + \frac{L}{2} + 3 \right) \frac{16 + L}{32}.$$



Theorem 2: Assuming a 16-byte block cipher, a uniform distribution on the set $\{1, \ldots, L\}$ where $L$ is a
multiple of the block size, and an operation $M = (\text{substitute}, i, j, \beta)$, the average number of
block-cipher evaluations, when performing an update, is upper bounded by
$\left( \frac{2 C_0}{1 + L} + \frac{L}{2} + 3 \right) \frac{16 + L}{32}$, where $C_0 = |\beta|$ is the number
of bytes to insert.

If we consider the case of a non-uniform distribution $\phi \in T^*$ with mean $\mu$, by denoting
$p_{\min} = \min_{i \in \{1, \ldots, L\}} \phi(i)$ we can show in the same way that $E(n_1)$ is upper bounded by
$C_0/\mu + 2/p_{\min} + 3L/\mu$. This constitutes a loose bound. In the rest of the paper we consider the use
of a uniform distribution.

Storage space: The average storage space required for the sequence $u$ depends on the choice of the
distribution and is upper bounded by $\left\lceil \frac{\log L}{8} \right\rceil$ bytes per length value. That
represents, even for a basic distribution such as $\mathcal{U}([1, L])$, a small storage overhead. The average
number $n_r$ of random elements drawn from $\phi$ can be statistically parameterized such that
$n_r \le \frac{|D| + L}{E(X)}$. As a result, the average total size (in bytes) of the encrypted message can be
upper bounded by $|D| + \frac{|D| + L}{E(X)} \left( N + \left\lceil \frac{\log L}{8} \right\rceil \right)$ where
$X \sim \mathcal{U}([1, L])$.

Considering that $n$ is the document size in bytes and $L$ a multiple of the block size such that $L = lN$,
Table III presents the storage space and the running time for an encryption with $N = 16$ and various
values of $l$ (note that we have rounded the factor of $n$ and the constant to three significant digits and
two significant digits respectively).



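| Mode  | Distribution            | Encryption or decryption | Synchronization | Average size      |
|-------|-------------------------|--------------------------|-----------------|-------------------|
| mcXOR | $\mathcal{U}([1, 128])$ | $\le 0.078n + 10$        | $\le 302$       | $\le 1.264n + 34$ |
| mcXOR | $\mathcal{U}([1, 256])$ | $\le 0.071n + 18$        | $\le 1114$      | $\le 1.133n + 34$ |
| mcXOR | $\mathcal{U}([1, 512])$ | $\le 0.067n + 34$        | $\le 4274$      | $\le 1.071n + 36$ |
| rECB  | constant                | $0.063n$                 | block-wise      | $2n$              |

TABLE III: Efficiency of mcXOR where the encryption/decryption and the random walks synchronization are expressed in
average number of block-cipher evals. Note further that $n$ is the document size in bytes.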
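
The synchronization column and the slope of the average size can be recomputed with a few lines of Python. This is a sketch under assumptions: the synchronization cost is taken to be $(L/2 + 3)(16 + L)/32$ block-cipher evals, each length is assumed to be stored on $\lceil \log_2 L / 8 \rceil$ bytes, $E(X) = (L + 1)/2$, and the function names are hypothetical.

```python
import math


def sync_bound(L: int) -> float:
    """Synchronization cost in block-cipher evals: (L/2 + 3) * (16 + L) / 32."""
    return (L / 2 + 3) * (16 + L) / 32


def size_slope(L: int, N: int = 16) -> float:
    """Factor of n in the average size: 1 + (N + ceil(log2(L) / 8)) / E(X), with E(X) = (L + 1) / 2."""
    length_bytes = math.ceil(math.log2(L) / 8)
    return 1 + 2 * (N + length_bytes) / (L + 1)


for L in (128, 256, 512):
    print(L, math.ceil(sync_bound(L)), size_slope(L))
```

Rounded up, `sync_bound` gives 302, 1114 and 4274 evaluations for $L = 128, 256, 512$. With 16-byte blocks, the last figure is 68,384 bytes.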



As a result, our scheme is able to reduce drastically the storage space overhead consumed by ciphertexts.
The synchronization complexity corresponds to the contribution $\left( \frac{L}{2} + 3 \right) \frac{16 + L}{32}$,
and we observe that the more we want to decrease the storage overhead, the longer the delay for
synchronizing the random walks. For instance, if the distribution used is $\mathcal{U}([1, 512])$, the expansion
of the ciphertext is slight but the synchronization needs to recipher approximately 70KB of data. Thus, such
a high value for $L$ is justified for large documents.

How to index efficiently: When the document is large and a user accesses only a portion of a ciphered
document, one might wonder how to index the ciphered document efficiently or, in other words, how to
evaluate efficiently a sum $\sum_{m=0}^{a} u_m$. A solution is to use a history-independent data structure,
such as the Oblivious Tree [10]. The use of such an oblivious data structure is important to keep the perfect
privacy property. Since the interest of self-balancing data structures is well known, we only describe its use
briefly. The lengths $(u_i)_{i=1 \ldots |u|}$ are assigned to the leaves of the tree, and the value assigned
to a parent node is computed as the sum of the values of its children. Then an access to a byte index, an
update, or the addition of a length value are all done in $O(\log |u|)$ add operations.
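
The sum-at-the-parent idea can be illustrated with an ordinary complete binary tree stored in an array. Note the assumptions: this is a plain static sum tree, not the Oblivious Tree of [10], so it is not history independent and does not by itself preserve perfect privacy; it also does not support adding a new length. The class name `SumTree` and its methods are hypothetical.

```python
class SumTree:
    """Complete binary tree over the part lengths; each internal node stores the sum of its children."""

    def __init__(self, lengths):
        self.size = 1
        while self.size < max(1, len(lengths)):
            self.size *= 2
        self.node = [0] * (2 * self.size)
        self.node[self.size:self.size + len(lengths)] = lengths   # leaves
        for i in range(self.size - 1, 0, -1):                     # internal nodes, bottom-up
            self.node[i] = self.node[2 * i] + self.node[2 * i + 1]

    def update(self, part: int, new_length: int) -> None:
        """Set the length of part `part` (0-based) and refresh the sums on the path to the root."""
        i = self.size + part
        self.node[i] = new_length
        i //= 2
        while i >= 1:
            self.node[i] = self.node[2 * i] + self.node[2 * i + 1]
            i //= 2

    def locate(self, byte_index: int):
        """Return (part, offset) for a 0-based byte index by descending from the root."""
        if not 0 <= byte_index < self.node[1]:
            raise IndexError(byte_index)
        i = 1
        while i < self.size:
            left = self.node[2 * i]
            if byte_index < left:
                i = 2 * i
            else:
                byte_index -= left
                i = 2 * i + 1
        return i - self.size, byte_index
```

Both `update` and `locate` touch one node per level, which is the logarithmic behaviour the paragraph refers to.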

V. A NEW SCHEME COMBINING THE TWO APPROACHES 

We have described two ways to improve the space efficiency: the first one consists in using a sliding
window over a sequence of small randomizers, and then applying a block cipher algorithm to each successive
window to obtain a keystream; the second one consists in applying a randomized mode of operation several
times, once to each sufficiently large part of a document, so as to reduce the number of random input
vectors. To provide a provable perfect privacy property at each update in the latter scheme we employ a
synchronization of random walks (this remark would also hold for the former if we extended it into a
byte-wise incremental version in a black-box manner). This synchronization produces a ciphering overhead
which increases with the average size of a part, and any attempt to reduce this effect is welcome. For this
purpose, it is not difficult to imagine a byte-wise incremental encryption scheme combining these two
approaches: we partition the document into variable-sized parts and cipher each part with a slightly
different version of XOR mode in which the initialization vector is no longer generated uniformly at random
but corresponds to a window sliding over a sequence of randomizers. Such an operating mode, denoted swrXOR,
is then parameterized with two parameters, the probability distribution of the parts' lengths and the
randomizer size $e$. While the update algorithms can be deduced from our previous schemes, it is worth
stressing the need for caution in the regeneration of the randomizers involved in an update:

• The repartition of the document including the modification is composed of parts containing this
modification but also of some other ones induced by the synchronization of random walks. Let us
assume that this latter unchanged content corresponds to the parts $B_i, B_{i+1}, \ldots, B_{i+n-1}$ in the
original partition. Let us denote by $m$ the overall number of repartitioned parts and by
$B'_i, B'_{i+1}, \ldots, B'_{i+m-1}$ the corresponding changed parts in the updated partition.

• In addition to these $m$ parts, this variant of XOR has to be applied to $2(d - 1)$ adjacent parts. Indeed,
for similar reasons as for swrECB, as regards the last part, all the randomizers of the associated
input window need to be regenerated uniformly at random, implying the (re-)generation of a total
of $m + d - 1$ randomizers, with the objective of preserving update indistinguishability. As a result,
both sequences of unchanged parts $B_{i-(d-2)}, \ldots, B_{i-1}$ and $B_{i+n}, B_{i+n+1}, \ldots, B_{i+n+d-2}$
need to be reciphered to maintain consistency with the sliding windows. Security and run times are
discussed in Sections VI and VII respectively.

VI. Security analysis 

In the following subsections, we are interested in the security of our schemes concerning message and
update secrecy and, finally, the perfect privacy property. Note that proofs of indistinguishability are given
assuming the use of a random function. They can be derived in a straightforward way in the pseudorandom
function model; we refer to [12] for a security reduction between a mode of operation and the
underlying pseudorandom function. These proofs being exactly the same, we do not present them in
this paper.

A. swrECB 

The perfect privacy of this scheme is obvious and we focus on the message and update secrecy properties. 
We prove that swrECB is still secure despite the use of incremental operations and the rationale for 
regenerating at random all the blocks contained in the window associated with the inserted (or replaced) 
block will become clear. 

Theorem 3: Let the size $l$ be in number of blocks and suppose the use of a random function instead
of $F_K$. swrECB over modification space $M_B$ is a $(t, q_e, q_{inc}, \mu^*, l, \epsilon)$-I-CPA-secure
incremental encryption scheme in the find-then-guess sense where:

$$\epsilon = \frac{2 l \mu^* + l(l - 1)}{2^{8N + 1}}$$

and $\mu^*$ is the greatest amount of ciphered blocks that the adversary can obtain during the queries phase,
when performing only replace or insert operations, that is to say, $\mu^* = \sum_{i=1}^{q_e} n_i + q_{inc}(2d - 1)$
where $n_i$ is the block-length of the $i$-th document queried to the encryption oracle.

Theorem 4: Suppose the use of a random function instead of $F_K$. swrECB over modification space $M_B$
is a $(t, q_e, q_{inc}, \mu^*, \epsilon)$-IM-CPA-secure incremental encryption scheme in the find-then-guess
sense where:

/i* + 2(rf-l) 

and $\mu^*$ is the greatest amount of ciphered blocks that the adversary can obtain, as stated in the previous
theorem.



B. mcXOR 

We assume the use of a distribution $\phi = \mathcal{U}([1, L])$ where $L$ is a multiple of the block size, and
the proof of security in the appendix considers a particular stateless mode of encryption (XOR). In fact, we
can generalize by replacing XOR with any stateless mode of encryption; the proof remains similar. As for
swrECB, we prove that mcXOR is still secure despite the use of incremental operations.

Theorem 5: Let the size $l$ be in number of bytes and suppose the use of a random function instead of $F_K$.
mcXOR over modification space $M_B$ is a $(t, q_e, q_{inc}, \mu_1^*, l, \epsilon)$-I-CPA-secure incremental
encryption scheme in the find-then-guess sense where:

$$\epsilon \le \frac{1}{N} \left( 8 \frac{l^2}{L} + 40 l + 18 L + 4 \mu_1^* (l + L) \right) \frac{1}{2^{8N}}$$

and $\mu_1^*$ is the number of (re-)partitioned ciphered parts that the adversary can obtain during queries,
that is to say:

$$\mu_1^* = \frac{2}{L + 1} \left( \sum_{i=1}^{q_e} n_e^i + \sum_{i=1}^{q_{inc}} n_u^i \right) + 2 q_e + (L/2 + 3)\, q_{inc}$$

where $n_e^i$ is the byte-length of the $i$-th document queried to the encryption oracle and $n_u^i$ is the
byte-length of the $i$-th modification queried to the updating oracle.

Theorem 6: Suppose the use of a random function instead of $F_K$. mcXOR over modification space $M_B$ is
a $(t, q_e, q_{inc}, \mu_1^*, n_{uc}, \epsilon)$-IM-CPA-secure incremental encryption scheme in the
find-then-guess sense where:

$$\epsilon \le \frac{1}{N} \left( 8 \frac{n_{uc}^2}{L} + 40 n_{uc} + 18 L + 4 \mu_2^* (n_{uc} + L) \right) \frac{1}{2^{8N}}$$

with $\mu_2^*$ the amount of (re-)partitioned ciphered parts that the adversary can obtain during the queries
phase, augmented by the repartitioned parts (of unmodified content) obtained during the challenge update,
that is to say $\mu_2^* = \mu_1^* + L/2 + 3$, and $n_{uc}$ the byte-length of the data to insert in the
challenge update.

Theorem 7: For every document $D = b_1 \ldots b_n$ where $(b_i)_{i=1..n}$ are the ordered bytes of $D$ and for
every $\beta$ (including the empty string), considering the encrypted document
$C = \text{mcXOR}.\mathcal{E}_K(D)$, the outputs of
$\text{mcXOR}.\mathcal{E}_K(b_1 b_2 \ldots b_{i-1} b_i \beta b_j b_{j+1} \ldots b_{n-1} b_n)$ and
$\text{mcXOR}.\mathcal{I}_K((\text{substitute}, i, j, \beta), D, C)$ are perfectly indistinguishable.



C. swrXOR 

We make the same assumptions about the distribution of the parts' lengths. The proof of security in the
appendix describes only the differences with respect to the proofs supplied for mcXOR. It turns out that the
upper bounds on the insecurity of swrXOR stay close to those of mcXOR, up to a constant factor.

Theorem 8: Let the size $l$ be in number of bytes and suppose the use of a random function instead of $F_K$.
swrXOR over modification space $M_B$ is a $(t, q_e, q_{inc}, \mu_1^*, l, \epsilon)$-I-CPA-secure incremental
encryption scheme in the find-then-guess sense where:

e - ? ~W (4 + 40/ + 181 + 4/i * (/ + L) ) ¥^ 
and $\mu_1^*$ is the number of (re-)partitioned ciphered parts that the adversary can obtain during queries,
that is to say:

$$\mu_1^* = \frac{2}{L + 1} \left( \sum_{i=1}^{q_e} n_e^i + \sum_{i=1}^{q_{inc}} n_u^i \right) + 2 q_e + (L/2 + 3 + 2(d - 1))\, q_{inc}$$

where $n_e^i$ is the byte-length of the $i$-th document queried to the encryption oracle and $n_u^i$ is the
byte-length of the $i$-th modification queried to the updating oracle.

Theorem 9: Suppose the use of a random function instead of $F_K$. swrXOR over modification space $M_B$
is a $(t, q_e, q_{inc}, \mu_1^*, n_{uc}, \epsilon)$-IM-CPA-secure incremental encryption scheme in the
find-then-guess sense where:

7L { n 2 \ 1 

e < ^7 8^ + 40n uc + 18L + A/i* 2 (n uc + L) 



with $\mu_2^*$ the amount of (re-)partitioned ciphered parts that the adversary can obtain during the queries
phase, augmented by the repartitioned parts (of unmodified content) obtained during the challenge update,
that is to say $\mu_2^* = \mu_1^* + L/2 + 3 + 2(d - 1)$, and $n_{uc}$ the byte-length of the data to insert in
the challenge update.

Theorem 10: For every document $D = b_1 \ldots b_n$ where $(b_i)_{i=1..n}$ are the ordered bytes of $D$ and
for every $\beta$ (including the empty string), considering the encrypted document
$C = \text{swrXOR}.\mathcal{E}_K(D)$, the outputs of
$\text{swrXOR}.\mathcal{E}_K(b_1 b_2 \ldots b_{i-1} b_i \beta b_j b_{j+1} \ldots b_{n-1} b_n)$ and
$\text{swrXOR}.\mathcal{I}_K((\text{substitute}, i, j, \beta), D, C)$ are perfectly indistinguishable.



VII. Summary of results 

For the extension of swrECB into a byte-wise incremental scheme, we assume a slightly different approach
from that of [9], in which each variable-length part is ciphered with the lowest possible number of
block-cipher calls. For instance, if a part is $p$ bytes long, only $\lceil p/N \rceil$ block-cipher
evaluations are needed. Besides, as the construction of [9] was very generic, we discard the padding of the
variable-length parts and encrypt them so that their lengths remain the same⁴. Table IV summarizes the
efficiencies of our schemes. We notice that when we choose a very high parameter $L$ for mcXOR the delay for
the synchronization of the random walks becomes prohibitive, and we observe in this case that swrECB
becomes more attractive with respect to the update operations. Indeed, we can choose a smaller parameter
$L^*$ for the random walk used in the extended swrECB, along with an appropriate choice for $e$, so that the
space consumption is equivalent to what we observe with mcXOR parameterized with $L$. This observation
led us to consider the third scheme, swrXOR. From a space/update-time trade-off perspective, by using just
a little more than $e$ bits of overhead per ciphered part, swrXOR is the most interesting. If we select
the involved parameters correctly, combining the sliding-window approach with mcXOR significantly reduces
the expansion of the ciphertext while avoiding too large an increase of the constant in the average update
run time (the contribution of the random walks synchronization). By varying the parameters of our
encryption schemes, the following observations can be made about the coefficients in the linear upper
bounds (Figure 2 depicts the lines for various parameters):

• If we increase the parameter L or, more generally, the average size of a part, then the
expansion of the ciphertext is reduced.

• By increasing the average size of a part we increase the constant in the upper bound on the average
update run time, making incremental updates of the ciphertext less efficient for small changes
in the corresponding document. On the other hand, this is accompanied by a slight decrease of the
line slope, making incremental updates more efficient for big changes.

• Using swrXOR instead of mcXOR, with a smaller parameter L and an appropriate parameter e,
reduces both the ciphertext expansion and the constant part of the average update run
time. In Figure 2, we can observe the behaviours of mcXOR with L = 512 and swrXOR with L = 128



⁴ swrECB is an XOR-masking mode, so we can encrypt each part of length s bits with the corresponding s most significant
(or least significant, depending on the convention used) bits and discard the unused ones.
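Mode | Encryption/decryption (block-cipher evaluations on average) | Insertion of $\beta$ (block-cipher evaluations on average) | Average size

mcXOR | $\frac{2(\lvert D\rvert+L)}{L+1}\cdot\frac{16+L}{32}$ | $\left(\frac{2\lvert\beta\rvert}{L+1}+\frac{L}{2}+3\right)\frac{16+L}{32}$ | $\lvert D\rvert+\frac{2(\lvert D\rvert+L)}{L+1}\left(N+\frac{\lceil\log L\rceil}{8}\right)$

extended swrECB | $\frac{2(\lvert D\rvert+L)}{L+1}\cdot\frac{16+L}{32}$ | $\left(\frac{2\lvert\beta\rvert}{L+1}+\frac{L}{2}+3\right)\frac{16+L}{32}+2(d-1)$ | $\lvert D\rvert+\frac{2(\lvert D\rvert+L)}{L+1}\left(\frac{e(16+L)}{256}+\frac{\lceil\log L\rceil}{8}\right)+\frac{e(d-1)}{8}$

swrXOR | $\frac{2(\lvert D\rvert+L)}{L+1}\cdot\frac{16+L}{32}$ | $\left(\frac{2\lvert\beta\rvert}{L+1}+\frac{L}{2}+3+2(d-1)\right)\frac{16+L}{32}$ | $\lvert D\rvert+\frac{2(\lvert D\rvert+L)}{L+1}\left(\frac{e}{8}+\frac{\lceil\log L\rceil}{8}\right)$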

TABLE IV: Summary of tight upper bounds for the space and time efficiencies, assuming the use of a uniform distribution
$U([1, L])$ where L is a multiple of the block-cipher size. The space consumption is given in bytes.



and e = 8. From a space consumption standpoint, swrXOR is the more advantageous. As regards
the update run times, swrXOR is more advantageous for insertions of less than 200 KBytes. Finally,
when performing a small change at a given location, we would like to avoid (as often as possible)
completely re-encrypting the bytes of the document that follow. In this sense we emphasise that a
smaller constant in the average update run time is preferable for small documents.
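The per-part cost used in these bounds can be checked numerically. The sketch below is only an illustration under stated assumptions: part lengths are drawn uniformly from {1, ..., L}, the block size is N = 16 bytes, and the function names are hypothetical. It compares the exact average of $\lceil p/N \rceil$ with the closed form $(N + L)/(2N)$, which reads $(16 + L)/32$ for N = 16 and holds when L is a multiple of N.

```python
from fractions import Fraction


def average_evals_per_part(L: int, N: int = 16) -> Fraction:
    """Exact mean of ceil(p / N) for a part length p uniform on {1, ..., L}."""
    # -(-p // N) is integer ceiling division.
    return Fraction(sum(-(-p // N) for p in range(1, L + 1)), L)


def closed_form(L: int, N: int = 16) -> Fraction:
    """(N + L) / (2N); assumed valid only when L is a multiple of N."""
    return Fraction(N + L, 2 * N)


if __name__ == "__main__":
    for L in (64, 128, 512):
        print(L, average_evals_per_part(L), closed_form(L))
```

Larger values of L increase this per-part factor, which is consistent with the observation that the constant of the update run time grows with the average part size.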



[Figure 2. Two panels, each comparing mcXOR with L = 512, mcXOR with L = 128, and swrXOR with L = 64 and d = 16.
First panel, x-axis: document size in number of bytes ($\cdot 10^5$). Second panel, x-axis: number of contiguous
bytes to insert ($\cdot 10^5$).]

Fig. 2: Upper bounds for the average space consumptions and average update run times



Remarks about composition paradigms: Incremental authenticated encryption schemes can be obtained
by the encrypt-then-MAC composition paradigm [14]: (i) the document is encrypted with an incremental
encryption scheme (instantiated with a key $k_1$); (ii) an incremental MAC (Message Authentication
Code) is computed over the resulting ciphertext (instantiated with a key $k_2$). This composition is the most
secure: if the symmetric encryption (instantiated with a key $k_1$) is IND-CPA and if the symmetric signature
scheme (instantiated with a key $k_2 \neq k_1$) is INT-CTXT, then the resulting authenticated encryption scheme
is IND-CCA. Using one of our algorithms in such a composition leads not only to very substantial
savings in space consumption but sometimes also to an interesting gain in computational efficiency. Take,
for example, a variant of mcXOR based on the stateless variant of CBC, say mcCBC. We design a byte-wise
incremental authenticated encryption scheme in the following way: (i) choose a distribution for the
random walk that sufficiently reduces the expansion of the ciphertext produced by mcCBC, for instance
$U([1, 512])$, and encrypt the document; (ii) apply XOR-MAC [2] to the resulting ciphertext.
Even though XOR-MAC is a block-wise incremental MAC, the resulting composed scheme still allows
update operations whose run time is affine (on average) in the amount of data changed. This is due to the
fact that mcCBC respects the alignment of the unchanged ciphered blocks. Finally, the consumed space
is two times the expansion of the ciphertext. Consequently, the overall space consumed by the resulting
authenticated encryption is little more than two times the size of the document, which is much better than
the factor 4 implied by the composition of rECB with the incremental XOR-MAC [8].
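The order of operations in the encrypt-then-MAC paradigm can be sketched as follows. This is a minimal, non-incremental illustration only: it does not implement mcCBC or XOR-MAC. As stand-ins (our assumptions), the cipher is a toy keystream derived from HMAC-SHA256 and the MAC is HMAC-SHA256 from the Python standard library; all function names are hypothetical.

```python
import hashlib
import hmac
import os
from typing import Tuple


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream (stand-in cipher): HMAC-SHA256(key, nonce || counter) blocks."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        block = hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256)
        out += block.digest()
        counter += 1
    return bytes(out[:length])


def encrypt_then_mac(k1: bytes, k2: bytes, plaintext: bytes) -> Tuple[bytes, bytes, bytes]:
    """Step (i): encrypt under k1. Step (ii): MAC the resulting ciphertext under k2."""
    nonce = os.urandom(16)
    stream = keystream(k1, nonce, len(plaintext))
    ciphertext = bytes(p ^ s for p, s in zip(plaintext, stream))
    tag = hmac.new(k2, nonce + ciphertext, hashlib.sha256).digest()
    return nonce, ciphertext, tag


def verify_then_decrypt(k1: bytes, k2: bytes, nonce: bytes,
                        ciphertext: bytes, tag: bytes) -> bytes:
    """Check the tag on the ciphertext first; decrypt only if it verifies."""
    expected = hmac.new(k2, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    stream = keystream(k1, nonce, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))
```

Because the tag is computed over the ciphertext only, a party holding $k_2$ but not $k_1$ can verify or recompute it without learning the plaintext.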
VIII. Applications and conclusion

One can deduce a variant of the protocol of [15] which makes use of incremental encryption and
incremental MAC to update, at low cost, the cryptographic forms of outsourced documents. For instance, one
could choose a solution in which a user U can update documents stored on a server S while their contents
are kept secret from the latter but are authenticated by the two entities. Considering that U uses a secret
key $k_1$ to encrypt documents and a key $k_2$ (shared with S) to authenticate them, such a solution is clearly
possible with the encrypt-then-MAC composition: U updates a ciphered document in his (her)
local workspace with the key $k_1$ and sends the changed ciphered parts to S, which in turn updates
the corresponding MAC with the shared key $k_2$.

Many modern applications have to handle changes in large secure electronic documents stored
on remote servers, which requires the development of incremental cryptographic algorithms. Indeed,
this is mandatory in order to reduce the delays incurred when maintaining consistency. The block-wise
incremental algorithms defined so far have the advantage of being parallelisable and already allow
interesting operations such as block deletion/insertion. Unfortunately, their extension towards byte-wise
incremental schemes [9] does not solve the ciphertext expansion problem. In this paper, we propose new
byte-wise incremental and perfectly private encryption schemes that can be parametrised in order to
produce smaller ciphertexts. In cloud storage, such a characteristic is as critical as computational resources are.

A first approach is to design a space-efficient block-wise encryption scheme and extend it into a byte-wise
incremental one. Its security is proved assuming a good pseudorandom function. A second approach
is to use a stateless encryption mode as a fine-grained primitive. In this latter case, security relies only
on the correct use of this primitive. The run time of the incremental update operations is very efficient,
but the contribution of the random-walk synchronization increases with the parameters of the distribution
and thus lessens its interest for small documents and updates. The question then arises as to how to decrease
this synchronization overhead. A possible solution, our third approach denoted swrXOR, is a variant of the
mcXOR mode in which the input random blocks are replaced by sliding windows over a sequence of small
random blocks. In other words, this means combining the advantages of mcXOR and swrECB. In such a
solution, by choosing a small value for L and an appropriate value for e so as to approach the space
consumed by mcXOR, we can take advantage of the same space efficiency while lowering the delay for the
random-walk synchronization. Finally, it would be worthwhile to estimate the performance and security of
these schemes when a real synchronous stream cipher is used [16], [17].
Appendix

A. swrECB 

Proof of theorem 3: Let $G_1$ be the multiset of windows involved during encryption and updating
oracle accesses and let $G_2$ be the multiset of the $l$ windows involved during the encryption of the challenge.
Let $D_1$ be the event that the windows of $G_2$ are distinct from the windows of $G_1$, and let $D_2$ be the event
that the windows of $G_2$ are all distinct. For simplicity, we also define $S$ to be the event of success of $A$ in
the game $\mathrm{Expt}(A, k)$ for message secrecy.

Let us now consider the event $D = D_1 \cap D_2$. When the event $D$ occurs, the challenge ciphertext is
obtained by evaluating the random function $l$ times on distinct and new points, new in the sense that
they have not appeared during queries. Thus, the corresponding outputs of the random function are $l$ new
random draws from $\{0, 1\}^{8N}$. By XORing such a random value with a plaintext block of the challenge,
we obtain a ciphered block which is still a new random draw. The $l$ ciphered blocks of the challenge are
then independently and identically distributed over $\{0, 1\}^{8N}$. Besides, this distribution is independent of the
previous ciphered blocks obtained during queries. Consequently, the adversary can at best make a random
guess. We have $\Pr(S \mid D) = 1/2$ and it follows that

$$\left|\Pr(S) - \frac{1}{2}\right| \le \left|\Pr(S \mid \bar{D}) - \frac{1}{2}\right| \Pr(\bar{D}) \le \frac{1}{2}\Pr(\bar{D}).$$



Before continuing, consider the windows $(w_i)_{i=1\ldots n}$ in the same ciphertext: the probability that $w_i = w_{i+k}$
is exactly $1/2^{ed}$ for all $i, k > 0$. In other words, whether or not two windows within the same ciphertext share
a common (but shifted) sequence of randomizers, the probability that they collide is the same.
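This collision claim can be verified exhaustively for toy parameters. The sketch below is an illustration only, with a hypothetical function name: it enumerates every sequence of e-bit randomizers covering two windows of d consecutive randomizers shifted by k positions, and counts how often the two windows are equal.

```python
from fractions import Fraction
from itertools import product


def window_collision_probability(e: int, d: int, k: int) -> Fraction:
    """Exact Pr[w_i = w_{i+k}] for windows of d consecutive e-bit randomizers."""
    n = d + k  # randomizers r_0 .. r_{d+k-1} cover both windows
    hits = 0
    total = 0
    for seq in product(range(2 ** e), repeat=n):
        total += 1
        # w_i is seq[0:d]; w_{i+k} is seq[k:k+d] (shifted by k positions)
        if seq[0:d] == seq[k:k + d]:
            hits += 1
    return Fraction(hits, total)


if __name__ == "__main__":
    for k in (1, 2, 3, 4):
        print(k, window_collision_probability(1, 3, k))
```

For every shift k, overlapping or not, the enumeration returns $1/2^{ed}$, since equality of the two windows imposes d independent equalities between e-bit randomizers.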
Now it remains to estimate the probability of occurrence of the event $\bar{D}$:

• Let us denote by $(w^{c}_i)_{i=1\ldots l}$ the windows used in the challenge ciphertext. Let us denote by
$(w^{e_1}_i)_{i=1\ldots n_1}$, $(w^{e_2}_i)_{i=1\ldots n_2}$, ..., $(w^{e_{q_e}}_i)_{i=1\ldots n_{q_e}}$ the windows used during encryption queries and by
$(w^{u_1}_i)_{i=1\ldots 2d-1}$, $(w^{u_2}_i)_{i=1\ldots 2d-1}$, ..., $(w^{u_{q_{inc}}}_i)_{i=1\ldots 2d-1}$ the windows used during update queries. We assume update
queries of type replacement or insertion (only) since they bring more information to the adversary.

For $i \in [1, l]$, $j \in [1, q_e]$ and $k \in [1, n_j]$ let us denote the event $d^{e}_{ijk} = \{w^{c}_i = w^{e_j}_k\}$, and for $i \in [1, l]$,
$j \in [1, q_{inc}]$ and $k \in [1, 2d-1]$ let us denote the event $d^{u}_{ijk} = \{w^{c}_i = w^{u_j}_k\}$. The event $\bar{D}_1$ is included in

$$\bigcup_{i=1\ldots l}\left(\bigcup_{j=1\ldots q_e}\ \bigcup_{k=1\ldots n_j} d^{e}_{ijk}\ \cup \bigcup_{j=1\ldots q_{inc}}\ \bigcup_{k=1\ldots 2d-1} d^{u}_{ijk}\right).$$

We therefore have

$$\Pr(\bar{D}_1) \le \frac{l\,\mu^*}{2^{ed}}$$

where $\mu^* = \sum_{i=1}^{q_e} n_i + q_{inc}(2d-1)$. The contribution $q_{inc}(2d-1)$ corresponds to the most favourable
update queries for the adversary (insertion or replacement only).
• We define the event $d_2^{ij} = \{w^{c}_i = w^{c}_j\}$ for all $i, j \in [1, l]$ with $i \neq j$. The event $\bar{D}_2$ being included in
$\bigcup_{i=1\ldots l}\ \bigcup_{j=i+1\ldots l} d_2^{ij}$, we can subsequently upper bound its probability of occurrence as follows:

$$\Pr(\bar{D}_2) \le \sum_{i=1}^{l}\ \sum_{j=i+1}^{l} \frac{1}{2^{ed}} = \frac{l(l-1)}{2^{ed+1}}.$$

■

Proof of theorem 4: Now let us consider the update secrecy game. Deletions at the same location being
obviously indistinguishable, the adversary will return in the "find" phase operations of type insertion or
replacement.

Let us now consider the "guess" phase. Notice that when the challenger applies one of the two operations
on the ciphertext, a certain number (at most $2(d-1)$) of adjacent ciphered blocks change while the
corresponding plaintext blocks are unchanged. It is important to make clear that this is only due to the
change of the $2(d-1)$ adjacent input windows to the random function. Indeed, considering a challenge
update of type (insert, $i$, $P^*$), the returned updated ciphertext $C$ contains at most $2d-1$ new windows.
One of them, $w'$, serves to encrypt the inserted block and we call the other ones "the adjacents". Let $G^u$ be
the multiset of windows of $G_1$ augmented by the new adjacent windows $(w'_j)_{j=i-d+1\ldots i-1}$ and $(w'_j)_{j=i+1\ldots i+d-1}$.
We denote by $D^u$ the event that $w'$ is distinct from the windows of $G^u$.

Now, let us consider the event $D^u$. When the event $D^u$ occurs, whatever the operation chosen by the
challenger, this new ciphered block is indistinguishable from a new random draw from $\{0, 1\}^{8N}$. This is
due to the fact that the output of the random function is really a new random draw, and not a reused one. Just
as in the previous proof, by denoting again by $S$ the event of success of $A$ in the game $\mathrm{Expt}(A, k)$ for update secrecy,
we have

$$\left|\Pr(S) - \frac{1}{2}\right| \le \frac{1}{2}\Pr(\bar{D}^u).$$

We fix an arbitrary order for the elements of $G^u$, that is to say, we write $G^u = (w_i)_{i=1\ldots \mu^*+2(d-1)}$. We
denote by $d_i$ the event that $w' = w_i$ for $i \in [1, \mu^* + 2(d-1)]$. The events $(d_i)_{i=1\ldots \mu^*+2(d-1)}$ are not mutually
exclusive, but we can upper bound the probability of occurrence of $\bar{D}^u$ as follows:

$$\Pr(\bar{D}^u) \le \frac{\mu^* + 2(d-1)}{2^{ed}}.$$



Why regenerate all the random blocks in the window w': Taking the example of a replace operation,
assume that we do not regenerate all of them but only one, the first one. Consider the value
$w = r_i \| r_{i+1} \| \ldots \| r_{i+d-1}$ before an update and the corresponding value $w' = r'_i \| r_{i+1} \| \ldots \| r_{i+d-1}$ after an
update. The probability that $w'$ equals $w$ is exactly $1/2^e$. In this case update queries are more advantageous
for the adversary than encryption queries, which destroys the indistinguishability of the scheme. We conclude
that all the random blocks that serve to encrypt the inserted (or replacement) block need to be regenerated,
implying a change in the $2(d-1)$ adjacent windows. This leads to the re-encryption of $2(d-1)$ unchanged
plaintext blocks, a necessary encryption overhead.



B. mcXOR 

Stochastic processes background: Let $X_1, X_2, \ldots$ be i.i.d. random variables drawn according to
$U([1, L])$ and define the sum $S_T = \sum_{i=1}^{T} X_i$. Consider the first time $T$ where $S_T \ge l$. We call $T$ the
stopping time for the stopping rule $\min_T (S_T \text{ s.t. } S_T \ge l)$. The following hold:

• Wald's first equation: $E(S_T) = E(T)E(X_1)$,

• Wald's second equation: $E((S_T - T E(X_1))^2) = V(X_1)E(T)$.

We deduce the following: (i) Fact 1. According to the first equation, we have $\frac{l}{E(X_1)} \le E(T) \le \frac{l+L}{E(X_1)}$; (ii)
Fact 2. By developing the second equation and noticing that $Tl \le TS_T \le T(l + L)$ we have the following
rough upper bound: $E(T^2) \le 4\left(\frac{l}{L}\right)^2 + 20\,\frac{l}{L} + 9$.
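Wald's first equation and the two-sided bound of Fact 1 can be checked exactly for small parameters. The sketch below is an illustration under assumptions: steps are uniform on {1, ..., L}, the walk stops at the first partial sum that reaches l or more, and the function name is hypothetical. It computes $E(T)$ and $E(S_T)$ by backward dynamic programming with exact rational arithmetic.

```python
from fractions import Fraction
from typing import Tuple


def stopping_time_moments(l: int, L: int) -> Tuple[Fraction, Fraction]:
    """Exact (E(T), E(S_T)) for i.i.d. steps uniform on {1..L}, stopping once S_T >= l."""
    # steps[s]: expected number of further draws when the partial sum is s.
    # final[s]: expected value of the final sum S_T when the partial sum is s.
    steps = [Fraction(0)] * (l + L)
    final = [Fraction(s) for s in range(l + L)]
    for s in range(l - 1, -1, -1):
        steps[s] = 1 + sum(steps[s + 1:s + L + 1], Fraction(0)) / L
        final[s] = sum(final[s + 1:s + L + 1], Fraction(0)) / L
    return steps[0], final[0]


if __name__ == "__main__":
    l, L = 100, 16
    e_t, e_s = stopping_time_moments(l, L)
    mean_step = Fraction(L + 1, 2)  # E(X_1) for the uniform distribution on {1..L}
    print(float(e_t), float(e_s), e_s == e_t * mean_step)
```

Since $l \le S_T \le l + L - 1$, dividing $E(S_T)$ by $E(X_1) = (L+1)/2$ gives the bounds of Fact 1 directly.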

Proof of theorem 5: The reasoning being almost the same as for swrECB, we give only the important
steps. First of all, we assume that the time taken to partition the bytestrings is negligible. Let us denote by $r'$
and $r''$ the random vectors used to encrypt respectively the parts of length $l'$ and $l''$, where $r'$, $r''$ are
drawn randomly from $\{0, 1\}^{8N}$ and $l'$, $l''$ are drawn randomly from $\{1, \ldots, L\}$. We say that $r'$ and $r''$
overlap in values if there exist $k'$, $k''$ such that $r' + k' = r'' + k''$ where $k' \in \{0, 1, \ldots, \lceil l'/N \rceil - 1\}$ and
$k'' \in \{0, 1, \ldots, \lceil l''/N \rceil - 1\}$. Let us denote by $\Pr_{overlap}$ the probability of occurrence of such an event. It is
clear that $\Pr_{overlap} \le \frac{2L}{N 2^{8N}}$. Continuing, we need to use a certain number of notations:

• $T_{e,i}$ is the number of parts in the i-th encryption query,

• $T_{u,i}$ is the number of parts in the i-th update query,

• $T_l$ is the number of parts in the challenge ciphertext,

• $G_1$ is the multiset of random vectors involved during encryption and updating oracle accesses. We
notice that $G_1$ is of size $\sum_{i} T_{e,i} + \sum_{i} T_{u,i}$,

• $G_2$ is the multiset of random vectors involved during the encryption of the challenge. This one is
then of size $T_l$.

Let $D_1$ be the event that none of the vectors of $G_2$ overlaps a vector of $G_1$, and let $D_2$ be the event that
none of the vectors of $G_2$ overlap each other. As previously we only need to be concerned with $D = D_1 \cap D_2$:

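• We use the update run time bound (inequation (1), Section IV-D) and Fact 1 to evaluate the probability
of occurrence of $\bar{D}_1$. By applying the union bound several times it follows that:

$$\Pr(\bar{D}_1) \le \left(\sum_{i=1}^{q_e} E(T_{e,i}) + \sum_{i=1}^{q_{inc}} E(T_{u,i})\right) E(T_l)\,\Pr_{overlap} \le \left(\sum_{i=1}^{q_e} E(T_{e,i}) + \sum_{i=1}^{q_{inc}} E(T_{u,i})\right) E(T_l)\,\frac{2L}{N 2^{8N}}.$$

• We use Fact 2 to evaluate the probability of occurrence of $\bar{D}_2$. By the union bound it follows that:

$$\Pr(\bar{D}_2) \le E(T_l^2)\,\Pr_{overlap} \le \frac{\frac{8l^2}{L} + 40l + 18L}{N 2^{8N}}.$$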
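The overlap probability of two counter ranges can be computed exhaustively on a toy domain. The sketch below is an illustration only: the modulus M stands in for $2^{8N}$, m1 and m2 stand in for the numbers of block-cipher inputs $\lceil l'/N \rceil$ and $\lceil l''/N \rceil$ consumed by two parts, addition is taken modulo M (our assumption), and the function name is hypothetical.

```python
from fractions import Fraction


def overlap_probability(M: int, m1: int, m2: int) -> Fraction:
    """Exact Pr[exists k', k'' with r' + k' = r'' + k'' (mod M)] for uniform r', r''."""
    hits = 0
    for r1 in range(M):
        range1 = {(r1 + k) % M for k in range(m1)}  # inputs used by the first part
        for r2 in range(M):
            range2 = {(r2 + k) % M for k in range(m2)}  # inputs used by the second part
            if range1 & range2:
                hits += 1
    return Fraction(hits, M * M)


if __name__ == "__main__":
    print(overlap_probability(64, 3, 4))
```

On this toy domain the result is $(m_1 + m_2 - 1)/M$, which stays below $2\max(m_1, m_2)/M$; with $m_1, m_2 \le L/N$ and $M = 2^{8N}$ this matches the form of the bound on $\Pr_{overlap}$ used in the proof.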



Proof of theorem 6: In the update secrecy game, the adversary will return in the "find" phase operations
of the form $(\text{substitute}, i, j, \beta_b)_{b=0,1}$ where $|\beta_0| = |\beta_1| > 0$. Indeed, the case $|\beta_0| = |\beta_1| = 0$ yields the same
updated message in both cases. Let us now denote by $T_{uc}$ the number of new parts produced during this
update. According to inequation (1), Section IV-D, we have $E(T_{uc}) \le \frac{2 n_{uc}}{L+1} + \frac{L}{2} + 3$. We need to split $T_{uc}$
into two times $T_{uc,1}$ and $T_{uc,2}$, where $T_{uc,1}$ corresponds to the time taken to produce the parts containing the
modified portion and $T_{uc,2}$ to the time taken to resynchronize the random walks, that is, the parts containing
unmodified data in the message. Considering the challenge update, let $G_3$ be the multiset $G_1$ augmented by
the random vectors used to encrypt the parts that serve to resynchronize the random walks and let $G_4$ be
the multiset of the random vectors used to encrypt the parts containing the modified portion of the message.
$G_3$ is of size $\sum_{i} T_{e,i} + \sum_{i} T_{u,i} + T_{uc,2}$ and $G_4$ is of size $T_{uc,1}$.

Let $D_3$ be the event that none of the vectors of $G_3$ overlaps a vector of $G_4$, and let $D_4$ be the event that
none of the vectors of $G_4$ overlap each other. As previously we only need to be concerned with $D = D_3 \cap D_4$.

• We note that $E(T_{uc,2})$ is maximized when $E(T_{uc,1})$ is minimized, so that $E(T_{uc,2}) \le L/2 + 3$. Then,
upper bounding the various average times, we deduce the following:

$$\Pr(\bar{D}_3) \le \left(\sum_{i=1}^{q_e} E(T_{e,i}) + \sum_{i=1}^{q_{inc}} E(T_{u,i})\right) E(T_{uc,2})\,\Pr_{overlap} \le \left(\sum_{i=1}^{q_e} E(T_{e,i}) + \sum_{i=1}^{q_{inc}} E(T_{u,i})\right)\left(\frac{L}{2} + 3\right)\frac{2L}{N 2^{8N}}.$$

• Here again, using Fact 2 and applying the union bound:

$$\Pr(\bar{D}_4) \le E(T_{uc,1}^2)\,\Pr_{overlap} \le \frac{\frac{8 n_{uc}^2}{L} + 40 n_{uc} + 18L}{N 2^{8N}}.$$

■

Proof of theorem 7:

The assertion of this proposition is obvious. If we consider the lengths of the blocks in the partition of
the plaintext $b_1 b_2 \ldots b_{i-1} b_i \beta b_j b_{j+1} \ldots b_{n-1} b_n$ performed by $\text{mcXOR}.\mathcal{E}_K(b_1 b_2 \ldots b_{i-1} b_i \beta b_j b_{j+1} \ldots b_{n-1} b_n)$ or
$\text{mcXOR}.\mathcal{I}_K((\text{substitute}, i, j, \beta), D, C)$, we see a sequence of terms independently and identically distributed
according to a discrete probability distribution, except the last one (due to the termination condition in the
algorithm). The probability of occurrence of this last term is conditioned on the length of the document and
on the sum of all previously drawn terms, so its probability distribution is still
the same in both cases. Finally, in either case, each variable-length block is encrypted independently of the
others with the XOR mode of operation.



C. swrXOR 

Proofs of theorems 8 and 9: Let $w_i = r_i r_{i+1} \ldots r_{i+e-1}$ be the base-$2^d$ encoding of the i-th sliding
window, that is, the window used to encrypt the i-th part. Let us denote by $l_i$ and $l_j$ the lengths of the
i-th and j-th parts, where $l_i$ and $l_j$ are drawn randomly from $\{1, \ldots, L\}$ and $i \neq j$. Let us also denote
by $S'$ the set $\{0, 1, \ldots, \lceil l_i/N \rceil - 1\}$ and by $S''$ the set $\{0, 1, \ldots, \lceil l_j/N \rceil - 1\}$. We say that $w_i$ and $w_j$ overlap
if $w_i + k' = w_j + k''$ where $k' \in S'$ and $k'' \in S''$. Let us denote by overlap such an event. It follows that

$$\Pr[\text{overlap}] \le \Pr\left[\exists\, k' \in S',\ k'' \in S'' \text{ s.t. } |w_i - w_j| = |k'' - k'|\right] \le \Pr\left[|w_i - w_j| < L/N\right].$$

Two cases may occur:

• $|j - i| \ge e$, it is then clear that

$$\Pr\left[|w_i - w_j| < L/N\right] \le \frac{L}{N 2^{8N-1}}. \qquad (2)$$

• $1 \le |j - i| \le e - 1$, then let us denote $o = |j - i|$. Assuming that $j > i$, let the following base-$2^d$
encodings be $A_i = r_i r_{i+1} \ldots r_{i+o-1}$, $A_{i+e} = r_{i+e} \ldots r_{i+o+e-1}$ and $Z = r_{i+o} \ldots r_{i+e-1}$. The string $Z$
corresponds to the $d(e-o)$ least significant bits of $w_i$, while it corresponds to the $d(e-o)$ most significant
bits of $w_{i+o}$. We can rewrite $w_i$ and $w_{i+o}$ in the following way

$$w_i = A_i (2^d)^{e-o} + Z$$

$$w_{i+o} = Z (2^d)^{o} + A_{i+e}$$

so that $|w_{i+o} - w_i| = |A_{i+e} - A_i 2^{d(e-o)} + Z(2^{do} - 1)|$. Note that the support of $A_i$ and $A_{i+e}$ is
$[0, (2^d)^o - 1]$. We are interested in the support of the random variable $X_{i+e} = A_{i+e} - A_i (2^d)^{e-o}$.
One can observe the following: (i) if $o \le e - o$ its support is a set of exactly $2^{2do}$ values on which $X_{i+e}$
is uniformly distributed; (ii) if $o > e - o$ its support is a set of at least $2^{do}$ values. More precisely, some
of the elements taken by $X_{i+e}$ can indeed be described by two distinct couples $(A_i, A_{i+e})$, $(A'_i, A'_{i+e})$
with $A_i \neq A'_i$ or $A_{i+e} \neq A'_{i+e}$. Consequently, for any value $x$ of this set we can upper bound its
probability of occurrence by $1/2^{do-1}$. Since our interest is to upper bound $\Pr[|w_i - w_j| < L/N]$, we
continue in the following way by denoting $U = 2^{do} - 1$ for a convenient display



$$\Pr\left[|w_{i+o} - w_i| < \frac{L}{N}\right] \le \Pr\left[|X_{i+e} + ZU| < \frac{L}{N}\right] \le \sum_{z=0}^{2^{d(e-o)}-1} \Pr\left[|X_{i+e} + zU| < \frac{L}{N}\right] \Pr[Z = z]$$

$$\le \frac{2L/N}{2^{de-1}} + \frac{4L/N}{(2^{do} - 1)\,2^{d(e-o)}} \le \frac{6L}{N 2^{8N-1}}. \qquad (3)$$

Whatever the case is, using (2) and (3) we conclude that $\Pr[\text{overlap}] \le \frac{7L}{N 2^{8N-1}}$. The rest of the proof is
similar to that provided for mcXOR except that the multisets $G_1$, $G_2$, $G_3$ and $G_4$ are composed of the
sliding windows used during the queries phase. In addition to the overhead of encryption implied by the
synchronization of random walks during an update, we need to take into account the overheads implied by
the sliding windows which force us to re-encrypt $2(d-1)$ unchanged parts. Since these additional parts
bring more information to the adversary, we just have to re-estimate the cardinality of $G_1$ to deduce an upper
bound on the insecurity for message secrecy. In the same way, by re-estimating the cardinality of $G_3$ we deduce
an upper bound on the insecurity for update secrecy. ■
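The decomposition of two overlapping windows used in the proof above, $w_{i+o} - w_i = X_{i+e} + ZU$ with $X_{i+e} = A_{i+e} - A_i (2^d)^{e-o}$ and $U = 2^{do} - 1$, can be sanity-checked on random data. The sketch below is an illustration only: it follows the notation of this appendix (windows of e elements of d bits each, shifted by o positions); the helper names and the toy parameters are hypothetical.

```python
import random
from typing import List


def window_value(r: List[int], start: int, count: int, d: int) -> int:
    """Integer encoded in base 2^d by the elements r[start], ..., r[start+count-1]."""
    value = 0
    for x in r[start:start + count]:
        value = (value << d) | x
    return value


def check_decomposition(d: int, e: int, o: int, rng: random.Random) -> bool:
    """Check w_{i+o} - w_i == X + Z * U for two windows sharing e - o elements."""
    r = [rng.randrange(2 ** d) for _ in range(e + o)]
    w_i = window_value(r, 0, e, d)        # r_i ... r_{i+e-1}
    w_io = window_value(r, o, e, d)       # r_{i+o} ... r_{i+o+e-1}
    a_i = window_value(r, 0, o, d)        # A_i: the o elements only in w_i
    z = window_value(r, o, e - o, d)      # Z: the e - o shared elements
    a_ie = window_value(r, e, o, d)       # A_{i+e}: the o elements only in w_{i+o}
    x = a_ie - a_i * 2 ** (d * (e - o))   # X_{i+e}
    u = 2 ** (d * o) - 1                  # U
    return w_io - w_i == x + z * u


if __name__ == "__main__":
    rng = random.Random(0)
    print(all(check_decomposition(4, 8, o, rng) for o in range(1, 8)))
```

The identity holds for every shift $1 \le o \le e - 1$ because the shared string $Z$ sits in the low-order positions of $w_i$ and in the high-order positions of $w_{i+o}$.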
Proof of theorem 10: As regards the distribution of the blocks' lengths, the proof is exactly the same
as for mcXOR. The only difference is that, whether we consider encryption or update, each variable-length
part is encrypted with an input vector which is actually a sliding window (d-wise chain) over a string of
random elements. The way in which the parts are encrypted is therefore the same in both cases. Consequently,
encryption and update output ciphertexts that are perfectly indistinguishable. ■



